Re: Linux 4.4.180

From: Greg KH
Date: Fri May 17 2019 - 03:37:28 EST


diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 50f95689ab38..e4cd3be77663 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -277,6 +277,8 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/spectre_v1
/sys/devices/system/cpu/vulnerabilities/spectre_v2
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
+ /sys/devices/system/cpu/vulnerabilities/l1tf
+ /sys/devices/system/cpu/vulnerabilities/mds
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@xxxxxxxxxxxxxxx>
Description: Information about CPU vulnerabilities
diff --git a/Documentation/hw-vuln/mds.rst b/Documentation/hw-vuln/mds.rst
new file mode 100644
index 000000000000..3f92728be021
--- /dev/null
+++ b/Documentation/hw-vuln/mds.rst
@@ -0,0 +1,305 @@
+MDS - Microarchitectural Data Sampling
+======================================
+
+Microarchitectural Data Sampling is a hardware vulnerability which allows
+unprivileged speculative access to data which is available in various CPU
+internal buffers.
+
+Affected processors
+-------------------
+
+This vulnerability affects a wide range of Intel processors. The
+vulnerability is not present on:
+
+ - Processors from AMD, Centaur and other non-Intel vendors
+
+ - Older processor models, where the CPU family is < 6
+
+ - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)
+
+ - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
+ IA32_ARCH_CAPABILITIES MSR.
+
+Whether a processor is affected or not can be read out from the MDS
+vulnerability file in sysfs. See :ref:`mds_sys_info`.
+
+Not all processors are affected by all variants of MDS, but the mitigation
+is identical for all of them so the kernel treats them as a single
+vulnerability.
+
+Related CVEs
+------------
+
+The following CVE entries are related to the MDS vulnerability:
+
+ ============== ===== ===================================================
+ CVE-2018-12126 MSBDS Microarchitectural Store Buffer Data Sampling
+ CVE-2018-12130 MFBDS Microarchitectural Fill Buffer Data Sampling
+ CVE-2018-12127 MLPDS Microarchitectural Load Port Data Sampling
+ CVE-2019-11091 MDSUM Microarchitectural Data Sampling Uncacheable Memory
+ ============== ===== ===================================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write data
+into temporary microarchitectural structures (buffers). The data in the
+buffer can be forwarded to load operations as an optimization.
+
+Under certain conditions, usually a fault/assist caused by a load
+operation, data unrelated to the load memory address can be speculatively
+forwarded from the buffers. Because the load operation causes a fault or
+assist and its result will be discarded, the forwarded data will not cause
+incorrect program execution or state changes. But a malicious operation
+may be able to forward this speculative data to a disclosure gadget which
+in turn allows inferring the value via a cache side channel attack.
+
+Because the buffers are potentially shared between Hyper-Threads, cross
+Hyper-Thread attacks are possible.
+
+Deeper technical information is available in the MDS specific x86
+architecture section: :ref:`Documentation/x86/mds.rst <mds>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the MDS vulnerabilities can be mounted from malicious
+non-privileged user space applications running on hosts or guests.
+Malicious guest OSes can obviously mount attacks as well.
+
+Contrary to other speculation-based vulnerabilities the MDS vulnerability
+does not allow the attacker to control the memory target address. As a
+consequence the attacks are purely sampling-based, but as demonstrated with
+the TLBleed attack, samples can be postprocessed successfully.
+
+Web-Browsers
+^^^^^^^^^^^^
+
+ It's unclear whether attacks through Web-Browsers are possible at
+ all. The exploitation through JavaScript is considered very unlikely,
+ but other widely used web technologies like WebAssembly could possibly be
+ abused.
+
+
+.. _mds_sys_info:
+
+MDS system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current MDS
+status of the system: whether the system is vulnerable, and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/mds
+
+The possible values in this file are:
+
+ .. list-table::
+
+ * - 'Not affected'
+ - The processor is not vulnerable
+ * - 'Vulnerable'
+ - The processor is vulnerable, but no mitigation enabled
+ * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+ - The processor is vulnerable but microcode is not updated.
+
+ The mitigation is enabled on a best effort basis. See :ref:`vmwerv`
+ * - 'Mitigation: Clear CPU buffers'
+ - The processor is vulnerable and the CPU buffer clearing mitigation is
+ enabled.
+
+If the processor is vulnerable then the following information is appended
+to the above information:
+
+ ======================== ============================================
+ 'SMT vulnerable' SMT is enabled
+ 'SMT mitigated' SMT is enabled and mitigated
+ 'SMT disabled' SMT is disabled
+ 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
+ ======================== ============================================
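+
+For example, the file can be read from user space like any other sysfs
+attribute. A minimal C sketch (illustrative only)::
+
+    #include <stdio.h>
+
+    int main(void)
+    {
+        char line[128];
+        FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/mds", "r");
+
+        if (f && fgets(line, sizeof(line), f))
+            fputs(line, stdout); /* e.g. "Mitigation: Clear CPU buffers; SMT vulnerable" */
+        if (f)
+            fclose(f);
+        return 0;
+    }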
+
+.. _vmwerv:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ If the processor is vulnerable, but the availability of the microcode-based
+ mitigation mechanism is not advertised via CPUID, the kernel selects a best
+ effort mitigation mode. This mode invokes the mitigation instructions
+ without a guarantee that they clear the CPU buffers.
+
+ This is done to address virtualization scenarios where the host has the
+ microcode update applied, but the hypervisor is not yet updated to expose
+ the CPUID to the guest. If the host has updated microcode, the protection
+ takes effect; otherwise a few CPU cycles are wasted pointlessly.
+
+ The state in the mds sysfs file reflects this situation accordingly.
+
+
+Mitigation mechanism
+-------------------------
+
+The kernel detects the affected CPUs and the presence of the microcode
+which is required.
+
+If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default. The mitigation can be controlled at boot
+time via a kernel command line option. See
+:ref:`mds_mitigation_control_command_line`.
+
+.. _cpu_buffer_clear:
+
+CPU buffer clearing
+^^^^^^^^^^^^^^^^^^^
+
+ The mitigation for MDS clears the affected CPU buffers on return to user
+ space and when entering a guest.
+
+ If SMT is enabled it also clears the buffers on idle entry when the CPU
+ is only affected by MSBDS and not any other MDS variant, because the
+ other variants cannot be protected against cross Hyper-Thread attacks.
+
+ For CPUs which are only affected by MSBDS, the user space, guest and idle
+ transition mitigations are sufficient and SMT is not affected.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The protection for host to guest transition depends on the L1TF
+ vulnerability of the CPU:
+
+ - CPU is affected by L1TF:
+
+ If the L1D flush mitigation is enabled and up to date microcode is
+ available, the L1D flush mitigation is automatically protecting the
+ guest transition.
+
+ If the L1D flush mitigation is disabled then the MDS mitigation is
+ invoked explicitly when the host MDS mitigation is enabled.
+
+ For details on L1TF and virtualization see:
+ :ref:`Documentation/hw-vuln/l1tf.rst <mitigation_control_kvm>`.
+
+ - CPU is not affected by L1TF:
+
+ CPU buffers are flushed before entering the guest when the host MDS
+ mitigation is enabled.
+
+ The resulting MDS protection matrix for the host to guest transition:
+
+ ============ ===== ============= ============ =================
+ L1TF MDS VMX-L1FLUSH Host MDS MDS-State
+
+ Don't care No Don't care N/A Not affected
+
+ Yes Yes Disabled Off Vulnerable
+
+ Yes Yes Disabled Full Mitigated
+
+ Yes Yes Enabled Don't care Mitigated
+
+ No Yes N/A Off Vulnerable
+
+ No Yes N/A Full Mitigated
+ ============ ===== ============= ============ =================
+
+ This only covers the host to guest transition, i.e. prevents leakage from
+ host to guest, but does not protect the guest internally. Guests need to
+ have their own protections.
+
+.. _xeon_phi:
+
+XEON PHI specific considerations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The XEON PHI processor family is affected by MSBDS which can be exploited
+ cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
+ the use of MWAIT in user space (Ring 3), which opens a potential attack
+ vector for malicious user space. The exposure can be disabled on the kernel
+ command line with the 'ring3mwait=disable' command line option.
+
+ XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
+ before the CPU enters an idle state. As XEON PHI is not affected by L1TF
+ either, disabling SMT is not required for full protection.
+
+.. _mds_smt_control:
+
+SMT control
+^^^^^^^^^^^
+
+ All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
+ means on CPUs which are affected by MFBDS or MLPDS it is necessary to
+ disable SMT for full protection. These are most of the affected CPUs; the
+ exception is XEON PHI, see :ref:`xeon_phi`.
+
+ Disabling SMT can have a significant performance impact, but the impact
+ depends on the type of workload.
+
+ See the relevant chapter in the L1TF mitigation documentation for details:
+ :ref:`Documentation/hw-vuln/l1tf.rst <smt_control>`.
+
+
+.. _mds_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows control of the MDS mitigations at boot
+time with the option "mds=". The valid arguments for this option are:
+
+ ============ =============================================================
+ full If the CPU is vulnerable, enable all available mitigations
+ for the MDS vulnerability, CPU buffer clearing on exit to
+ userspace and when entering a VM. Idle transitions are
+ protected as well if SMT is enabled.
+
+ It does not automatically disable SMT.
+
+ off Disables MDS mitigations completely.
+
+ ============ =============================================================
+
+Not specifying this option is equivalent to "mds=full".
+
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace
+^^^^^^^^^^^^^^^^^^^^
+
+ If all userspace applications are from a trusted source and do not
+ execute untrusted code which is supplied externally, then the mitigation
+ can be disabled.
+
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The same considerations as above versus trusted user space apply.
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The protection depends on the state of the L1TF mitigations.
+ See :ref:`virt_mechanism`.
+
+ If the MDS mitigation is enabled and SMT is disabled, guest to host and
+ guest to guest attacks are prevented.
+
+.. _mds_default_mitigations:
+
+Default mitigations
+-------------------
+
+ The kernel default mitigations for vulnerable processors are:
+
+ - Enable CPU buffer clearing
+
+ The kernel does not by default enforce the disabling of SMT, which leaves
+ SMT systems vulnerable when running untrusted code. The same rationale as
+ for L1TF applies.
+ See :ref:`Documentation/hw-vuln/l1tf.rst <default_mitigations>`.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index da515c535e62..175d57049168 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2035,6 +2035,30 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Format: <first>,<last>
Specifies range of consoles to be captured by the MDA.

+ mds= [X86,INTEL]
+ Control mitigation for the Micro-architectural Data
+ Sampling (MDS) vulnerability.
+
+ Certain CPUs are vulnerable to an exploit against CPU
+ internal buffers which can forward information to a
+ disclosure gadget under certain conditions.
+
+ In vulnerable processors, the speculatively
+ forwarded data can be used in a cache side channel
+ attack to access data to which the attacker does
+ not have direct access.
+
+ This parameter controls the MDS mitigation. The
+ options are:
+
+ full - Enable MDS mitigation on vulnerable CPUs
+ off - Unconditionally disable MDS mitigation
+
+ Not specifying this option is equivalent to
+ mds=full.
+
+ For details see: Documentation/hw-vuln/mds.rst
+
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
Amount of memory to be used when the kernel is not able
to see the whole system memory or for test.
@@ -2149,6 +2173,30 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
in the "bleeding edge" mini2440 support kernel at
http://repo.or.cz/w/linux-2.6/mini2440.git

+ mitigations=
+ [X86] Control optional mitigations for CPU
+ vulnerabilities. This is a set of curated,
+ arch-independent options, each of which is an
+ aggregation of existing arch-specific options.
+
+ off
+ Disable all optional CPU mitigations. This
+ improves system performance, but it may also
+ expose users to several CPU vulnerabilities.
+ Equivalent to: nopti [X86]
+ nospectre_v2 [X86]
+ spectre_v2_user=off [X86]
+ spec_store_bypass_disable=off [X86]
+ mds=off [X86]
+
+ auto (default)
+ Mitigate all CPU vulnerabilities, but leave SMT
+ enabled, even if it's vulnerable. This is for
+ users who don't want to be surprised by SMT
+ getting disabled across kernel upgrades, or who
+ have other ways of avoiding SMT-based attacks.
+ Equivalent to: (default behavior)
+
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
parameter allows control of the logging verbosity for
@@ -2450,7 +2498,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

nohugeiomap [KNL,x86] Disable kernel huge I/O mappings.

- nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2
+ nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
+ check bypass). With this option data leaks are possible
+ in the system.
+
+ nospectre_v2 [X86,PPC_FSL_BOOK3E] Disable all mitigations for the Spectre variant 2
(indirect branch prediction) vulnerability. System may
allow data leaks with this option, which is equivalent
to spectre_v2=off.
@@ -3600,9 +3652,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

spectre_v2= [X86] Control mitigation of Spectre variant 2
(indirect branch speculation) vulnerability.
+ The default operation protects the kernel from
+ user space attacks.

- on - unconditionally enable
- off - unconditionally disable
+ on - unconditionally enable, implies
+ spectre_v2_user=on
+ off - unconditionally disable, implies
+ spectre_v2_user=off
auto - kernel detects whether your CPU model is
vulnerable

@@ -3612,6 +3668,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
CONFIG_RETPOLINE configuration option, and the
compiler with which the kernel was built.

+ Selecting 'on' will also enable the mitigation
+ against user space to user space task attacks.
+
+ Selecting 'off' will disable both the kernel and
+ the user space protections.
+
Specific mitigations can also be selected manually:

retpoline - replace indirect branches
@@ -3621,6 +3683,48 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Not specifying this option is equivalent to
spectre_v2=auto.

+ spectre_v2_user=
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability between
+ user space tasks.
+
+ on - Unconditionally enable mitigations. Is
+ enforced by spectre_v2=on
+
+ off - Unconditionally disable mitigations. Is
+ enforced by spectre_v2=off
+
+ prctl - Indirect branch speculation is enabled,
+ but mitigation can be enabled via prctl
+ per thread. The mitigation control state
+ is inherited on fork.
+
+ prctl,ibpb
+ - Like "prctl" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different user
+ space processes.
+
+ seccomp
+ - Same as "prctl" above, but all seccomp
+ threads will enable the mitigation unless
+ they explicitly opt out.
+
+ seccomp,ibpb
+ - Like "seccomp" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different
+ user space processes.
+
+ auto - Kernel selects the mitigation depending on
+ the available CPU features and vulnerability.
+
+ Default mitigation:
+ If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+
+ Not specifying this option is equivalent to
+ spectre_v2_user=auto.
+
spec_store_bypass_disable=
[HW] Control Speculative Store Bypass (SSB) Disable mitigation
(Speculative Store Bypass vulnerability)
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 2fb35658d151..709d24b4b533 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -387,6 +387,7 @@ tcp_min_rtt_wlen - INTEGER
minimum RTT when it is moved to a longer path (e.g., due to traffic
engineering). A longer window makes the filter more resistant to RTT
inflations such as transient congestion. The unit is seconds.
+ Possible values: 0 - 86400 (1 day)
Default: 300

tcp_moderate_rcvbuf - BOOLEAN
diff --git a/Documentation/spec_ctrl.txt b/Documentation/spec_ctrl.txt
index 32f3d55c54b7..c4dbe6f7cdae 100644
--- a/Documentation/spec_ctrl.txt
+++ b/Documentation/spec_ctrl.txt
@@ -92,3 +92,12 @@ Speculation misfeature controls
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0);
+
+- PR_SPEC_INDIRECT_BRANCH: Indirect Branch Speculation in User Processes
+ (Mitigate Spectre V2 style attacks against user processes)
+
+ Invocations:
+ * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
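+
+  A minimal sketch of opting a task out of indirect branch speculation
+  (assuming the PR_* constants from a sufficiently recent <linux/prctl.h>;
+  error handling trimmed):
+
+    #include <stdio.h>
+    #include <sys/prctl.h>
+    #include <linux/prctl.h>
+
+    int main(void)
+    {
+        /* Disable indirect branch speculation for this task. */
+        if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
+                  PR_SPEC_DISABLE, 0, 0))
+            perror("PR_SET_SPECULATION_CTRL");
+
+        /* Query the resulting state bits. */
+        printf("state: %d\n", prctl(PR_GET_SPECULATION_CTRL,
+                                    PR_SPEC_INDIRECT_BRANCH, 0, 0, 0));
+        return 0;
+    }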
diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt
index 0a94ffe17ab6..b13e031beaa6 100644
--- a/Documentation/usb/power-management.txt
+++ b/Documentation/usb/power-management.txt
@@ -365,11 +365,15 @@ autosuspend the interface's device. When the usage counter is = 0
then the interface is considered to be idle, and the kernel may
autosuspend the device.

-Drivers need not be concerned about balancing changes to the usage
-counter; the USB core will undo any remaining "get"s when a driver
-is unbound from its interface. As a corollary, drivers must not call
-any of the usb_autopm_* functions after their disconnect() routine has
-returned.
+Drivers must be careful to balance their overall changes to the usage
+counter. Unbalanced "get"s will remain in effect when a driver is
+unbound from its interface, preventing the device from going into
+runtime suspend should the interface be bound to a driver again. On
+the other hand, drivers are allowed to achieve this balance by calling
+the ``usb_autopm_*`` functions even after their ``disconnect`` routine
+has returned -- say from within a work-queue routine -- provided they
+retain an active reference to the interface (via ``usb_get_intf`` and
+``usb_put_intf``).
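+
+A hedged sketch of this pattern (the driver-private names are
+illustrative, not from any real driver):
+
+	/* Deferred balancing of the usage counter after disconnect(). */
+	static void mydrv_deferred_work(struct work_struct *work)
+	{
+		struct mydrv *priv = container_of(work, struct mydrv, work);
+
+		/* Balance a "get" left over from before disconnect(). */
+		usb_autopm_put_interface(priv->intf);
+
+		/* Drop the reference taken earlier with usb_get_intf(). */
+		usb_put_intf(priv->intf);
+	}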

Drivers using the async routines are responsible for their own
synchronization and mutual exclusion.
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
new file mode 100644
index 000000000000..534e9baa4e1d
--- /dev/null
+++ b/Documentation/x86/mds.rst
@@ -0,0 +1,225 @@
+Microarchitectural Data Sampling (MDS) mitigation
+=================================================
+
+.. _mds:
+
+Overview
+--------
+
+Microarchitectural Data Sampling (MDS) is a family of side channel attacks
+on internal buffers in Intel CPUs. The variants are:
+
+ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
+ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
+ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+ - Microarchitectural Data Sampling Uncacheable Memory (MDSUM) (CVE-2019-11091)
+
+MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
+dependent load (store-to-load forwarding) as an optimization. The forward
+can also happen to a faulting or assisting load operation for a different
+memory address, which can be exploited under certain conditions. Store
+buffers are partitioned between Hyper-Threads so cross thread forwarding is
+not possible. But if a thread enters or exits a sleep state the store
+buffer is repartitioned, which can expose data from one thread to the other.
+
+MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
+L1 miss situations and to hold data which is returned or sent in response
+to a memory or I/O operation. Fill buffers can forward data to a load
+operation and also write data to the cache. When the fill buffer is
+deallocated it can retain the stale data of the preceding operations which
+can then be forwarded to a faulting or assisting load operation, which can
+be exploited under certain conditions. Fill buffers are shared between
+Hyper-Threads so cross thread leakage is possible.
+
+MLPDS leaks Load Port Data. Load ports are used to perform load operations
+from memory or I/O. The received data is then forwarded to the register
+file or a subsequent operation. In some implementations the Load Port can
+contain stale data from a previous operation which can be forwarded to
+faulting or assisting loads under certain conditions, which again can
+potentially be exploited. Load ports are shared between Hyper-Threads so cross
+thread leakage is possible.
+
+MDSUM is a special case of MSBDS, MFBDS and MLPDS. An uncacheable load from
+memory that takes a fault or assist can leave data in a microarchitectural
+structure that may later be observed using one of the same methods used by
+MSBDS, MFBDS or MLPDS.
+
+Exposure assumptions
+--------------------
+
+It is assumed that attack code resides in user space or in a guest with one
+exception. The rationale behind this assumption is that the code construct
+needed for exploiting MDS requires:
+
+ - to control the load to trigger a fault or assist
+
+ - to have a disclosure gadget which exposes the speculatively accessed
+ data for consumption through a side channel.
+
+ - to control the pointer through which the disclosure gadget exposes the
+ data
+
+The existence of such a construct in the kernel cannot be excluded with
+100% certainty, but the complexity involved makes it extremely unlikely.
+
+There is one exception, which is untrusted BPF. The functionality of
+untrusted BPF is limited, but it needs to be thoroughly investigated
+whether it can be used to create such a construct.
+
+
+Mitigation strategy
+-------------------
+
+All variants have the same mitigation strategy at least for the single CPU
+thread case (SMT off): Force the CPU to clear the affected buffers.
+
+This is achieved by using the otherwise unused and obsolete VERW
+instruction in combination with a microcode update. The microcode clears
+the affected CPU buffers when the VERW instruction is executed.
+
+For virtualization there are two ways to achieve CPU buffer
+clearing: either via the modified VERW instruction or via the L1D Flush
+command. The latter is issued when L1TF mitigation is enabled so the extra
+VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to
+be issued.
+
+If the VERW instruction with the supplied segment selector argument is
+executed on a CPU without the microcode update there is no side effect
+other than a small number of pointlessly wasted CPU cycles.
+
+This does not protect against cross Hyper-Thread attacks except for MSBDS,
+which is only exploitable cross Hyper-Thread when one of the Hyper-Threads
+enters a C-state.
+
+The kernel provides a function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
+(idle) transitions.
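+
+As an illustration only (a sketch assuming the upstream x86 inline helper;
+the memory-operand form of VERW is what triggers the microcode-based
+buffer clearing), the function boils down to::
+
+    static inline void mds_clear_cpu_buffers(void)
+    {
+        static const u16 ds = __KERNEL_DS;
+
+        /* "cc" is clobbered because VERW modifies ZF. */
+        asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+    }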
+
+As a special quirk to address virtualization scenarios where the host has
+the microcode updated, but the hypervisor does not (yet) expose the
+MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
+hope that it might actually clear the buffers. The state is reflected
+accordingly.
+
+According to current knowledge, additional mitigations inside the kernel
+itself are not required because the necessary gadgets to expose the leaked
+data cannot be controlled in a way which allows exploitation from malicious
+user space or VM guests.
+
+Kernel internal mitigation modes
+--------------------------------
+
+ ======= ============================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ mds=off is supplied on the kernel command line
+
+ full Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
+
+ vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ======= ============================================================
+
+If the CPU is affected and mds=off is not supplied on the kernel command
+line then the kernel selects the appropriate mitigation mode depending on
+the availability of the MD_CLEAR CPUID bit.
+
+Mitigation points
+-----------------
+
+1. Return to user space
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ When transitioning from kernel to user space the CPU buffers are flushed
+ on affected CPUs when the mitigation is not disabled on the kernel
+ command line. The mitigation is enabled through the static key
+ mds_user_clear.
+
+ The mitigation is invoked in prepare_exit_to_usermode() which covers
+ most of the kernel to user space transitions. There are a few exceptions
+ which do not invoke prepare_exit_to_usermode() on return to user
+ space. These exceptions use the paranoid exit code.
+
+ - Non Maskable Interrupt (NMI):
+
+ Access to sensitive data like keys, credentials in the NMI context is
+ mostly theoretical: The CPU can do prefetching or execute a
+ misspeculated code path and thereby fetch data which might end up
+ leaking through a buffer.
+
+ But for mounting other attacks the kernel stack address of the task is
+ already valuable information. So in full mitigation mode, the NMI is
+ mitigated on the return from do_nmi() to provide almost complete
+ coverage.
+
+ - Double fault (#DF):
+
+ A double fault is usually fatal, but the ESPFIX workaround, which can
+ be triggered from user space through modify_ldt(2), is a recoverable
+ double fault. #DF uses the paranoid exit path, so explicit mitigation
+ in the double fault handler is required.
+
+ - Machine Check Exception (#MC):
+
+ Another corner case is a #MC which hits between the CPU buffer clear
+ invocation and the actual return to user. As this still is in kernel
+ space it takes the paranoid exit path which does not clear the CPU
+ buffers. So the #MC handler repopulates the buffers to some
+ extent. Machine checks are not reliably controllable and the window is
+ extremely small so mitigation would just tick a checkbox that this
+ theoretical corner case is covered. To keep the amount of special
+ cases small, ignore #MC.
+
+ - Debug Exception (#DB):
+
+ This takes the paranoid exit path only when the INT1 breakpoint is in
+ kernel space. #DB on a user space address takes the regular exit path,
+ so no extra mitigation required.
+
+
+2. C-State transition
+^^^^^^^^^^^^^^^^^^^^^
+
+ When a CPU goes idle and enters a C-State the CPU buffers need to be
+ cleared on affected CPUs when SMT is active. This addresses the
+ repartitioning of the store buffer when one of the Hyper-Threads enters
+ a C-State.
+
+ When SMT is inactive, i.e. either the CPU does not support it or all
+ sibling threads are offline, CPU buffer clearing is not required.
+
+ The idle clearing is enabled on CPUs which are only affected by MSBDS
+ and not by any other MDS variant. The other MDS variants cannot be
+ protected against cross Hyper-Thread attacks because the Fill Buffer and
+ the Load Ports are shared. So on CPUs affected by other variants, the
+ idle clearing would be a window dressing exercise and is therefore not
+ activated.
+
+ The invocation is controlled by the static key mds_idle_clear which is
+ switched depending on the chosen mitigation mode and the SMT state of
+ the system.
+
+ The buffer clear is only invoked before entering the C-State to prevent
+ stale data from the idling CPU spilling to the Hyper-Thread sibling
+ after the store buffer is repartitioned and all entries become
+ available to the non-idle sibling.
+
+ When coming out of idle the store buffer is partitioned again so each
+ sibling has half of it available. The CPU coming back from idle could
+ then be speculatively exposed to contents of the sibling. The buffers
+ are flushed either on exit to user space or on VMENTER so malicious
+ code in user space or the guest cannot speculatively access them.
+
+ The mitigation is hooked into all variants of halt()/mwait(), but does
+ not cover the legacy ACPI IO-Port mechanism because the ACPI idle driver
+ was superseded around 2010 by the intel_idle driver, which is preferred
+ on all affected CPUs expected to gain the MD_CLEAR functionality in
+ microcode. Aside from that, the IO-Port mechanism is a legacy interface
+ which is only used on older systems which are either not affected or do
+ not receive microcode updates anymore.
diff --git a/Makefile b/Makefile
index ee0a50b871b9..6023a9dbad59 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
VERSION = 4
PATCHLEVEL = 4
-SUBLEVEL = 179
+SUBLEVEL = 180
EXTRAVERSION =
NAME = Blurry Fish Butt

diff --git a/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi b/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi
index d6d98d426384..cae04e806036 100644
--- a/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi
@@ -90,6 +90,7 @@
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_enet>;
phy-mode = "rgmii";
+ phy-reset-duration = <10>; /* in msecs */
phy-reset-gpios = <&gpio3 23 GPIO_ACTIVE_LOW>;
phy-supply = <&vdd_eth_io_reg>;
status = "disabled";
diff --git a/arch/arm/mach-iop13xx/setup.c b/arch/arm/mach-iop13xx/setup.c
index 53c316f7301e..fe4932fda01d 100644
--- a/arch/arm/mach-iop13xx/setup.c
+++ b/arch/arm/mach-iop13xx/setup.c
@@ -300,7 +300,7 @@ static struct resource iop13xx_adma_2_resources[] = {
}
};

-static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(64);
+static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(32);
static struct iop_adma_platform_data iop13xx_adma_0_data = {
.hw_id = 0,
.pool_size = PAGE_SIZE,
@@ -324,7 +324,7 @@ static struct platform_device iop13xx_adma_0_channel = {
.resource = iop13xx_adma_0_resources,
.dev = {
.dma_mask = &iop13xx_adma_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = (void *) &iop13xx_adma_0_data,
},
};
@@ -336,7 +336,7 @@ static struct platform_device iop13xx_adma_1_channel = {
.resource = iop13xx_adma_1_resources,
.dev = {
.dma_mask = &iop13xx_adma_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = (void *) &iop13xx_adma_1_data,
},
};
@@ -348,7 +348,7 @@ static struct platform_device iop13xx_adma_2_channel = {
.resource = iop13xx_adma_2_resources,
.dev = {
.dma_mask = &iop13xx_adma_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = (void *) &iop13xx_adma_2_data,
},
};
diff --git a/arch/arm/mach-iop13xx/tpmi.c b/arch/arm/mach-iop13xx/tpmi.c
index db511ec2b1df..116feb6b261e 100644
--- a/arch/arm/mach-iop13xx/tpmi.c
+++ b/arch/arm/mach-iop13xx/tpmi.c
@@ -152,7 +152,7 @@ static struct resource iop13xx_tpmi_3_resources[] = {
}
};

-u64 iop13xx_tpmi_mask = DMA_BIT_MASK(64);
+u64 iop13xx_tpmi_mask = DMA_BIT_MASK(32);
static struct platform_device iop13xx_tpmi_0_device = {
.name = "iop-tpmi",
.id = 0,
@@ -160,7 +160,7 @@ static struct platform_device iop13xx_tpmi_0_device = {
.resource = iop13xx_tpmi_0_resources,
.dev = {
.dma_mask = &iop13xx_tpmi_mask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
},
};

@@ -171,7 +171,7 @@ static struct platform_device iop13xx_tpmi_1_device = {
.resource = iop13xx_tpmi_1_resources,
.dev = {
.dma_mask = &iop13xx_tpmi_mask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
},
};

@@ -182,7 +182,7 @@ static struct platform_device iop13xx_tpmi_2_device = {
.resource = iop13xx_tpmi_2_resources,
.dev = {
.dma_mask = &iop13xx_tpmi_mask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
},
};

@@ -193,7 +193,7 @@ static struct platform_device iop13xx_tpmi_3_device = {
.resource = iop13xx_tpmi_3_resources,
.dev = {
.dma_mask = &iop13xx_tpmi_mask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
},
};

diff --git a/arch/arm/plat-iop/adma.c b/arch/arm/plat-iop/adma.c
index a4d1f8de3b5b..d9612221e484 100644
--- a/arch/arm/plat-iop/adma.c
+++ b/arch/arm/plat-iop/adma.c
@@ -143,7 +143,7 @@ struct platform_device iop3xx_dma_0_channel = {
.resource = iop3xx_dma_0_resources,
.dev = {
.dma_mask = &iop3xx_adma_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = (void *) &iop3xx_dma_0_data,
},
};
@@ -155,7 +155,7 @@ struct platform_device iop3xx_dma_1_channel = {
.resource = iop3xx_dma_1_resources,
.dev = {
.dma_mask = &iop3xx_adma_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = (void *) &iop3xx_dma_1_data,
},
};
@@ -167,7 +167,7 @@ struct platform_device iop3xx_aau_channel = {
.resource = iop3xx_aau_resources,
.dev = {
.dma_mask = &iop3xx_adma_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = (void *) &iop3xx_aau_data,
},
};
diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
index 8861c367d061..51c3737ddba7 100644
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -645,7 +645,7 @@ static struct platform_device orion_xor0_shared = {
.resource = orion_xor0_shared_resources,
.dev = {
.dma_mask = &orion_xor_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = &orion_xor0_pdata,
},
};
@@ -706,7 +706,7 @@ static struct platform_device orion_xor1_shared = {
.resource = orion_xor1_shared_resources,
.dev = {
.dma_mask = &orion_xor_dmamask,
- .coherent_dma_mask = DMA_BIT_MASK(64),
+ .coherent_dma_mask = DMA_BIT_MASK(32),
.platform_data = &orion_xor1_pdata,
},
};
diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S
index 87c697181d25..4faff3e77b25 100644
--- a/arch/mips/kernel/scall64-o32.S
+++ b/arch/mips/kernel/scall64-o32.S
@@ -126,7 +126,7 @@ trace_a_syscall:
subu t1, v0, __NR_O32_Linux
move a1, v0
bnez t1, 1f /* __NR_syscall at offset 0 */
- lw a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
+ ld a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
.set pop

1: jal syscall_trace_enter
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 58a1fa979655..01b6c00a7060 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -136,7 +136,7 @@ config PPC
select GENERIC_SMP_IDLE_THREAD
select GENERIC_CMOS_UPDATE
select GENERIC_TIME_VSYSCALL_OLD
- select GENERIC_CPU_VULNERABILITIES if PPC_BOOK3S_64
+ select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
select GENERIC_CLOCKEVENTS
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
@@ -162,6 +162,11 @@ config PPC
select ARCH_HAS_DMA_SET_COHERENT_MASK
select HAVE_ARCH_SECCOMP_FILTER

+config PPC_BARRIER_NOSPEC
+ bool
+ default y
+ depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
+
config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
new file mode 100644
index 000000000000..8944c55591cf
--- /dev/null
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -0,0 +1,21 @@
+#ifndef _ASM_POWERPC_ASM_PROTOTYPES_H
+#define _ASM_POWERPC_ASM_PROTOTYPES_H
+/*
+ * This file is for prototypes of C functions that are only called
+ * from asm, and any associated variables.
+ *
+ * Copyright 2016, Daniel Axtens, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ */
+
+/* Patch sites */
+extern s32 patch__call_flush_count_cache;
+extern s32 patch__flush_count_cache_return;
+
+extern long flush_count_cache;
+
+#endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index b9e16855a037..e7cb72cdb2ba 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -92,4 +92,25 @@ do { \
#define smp_mb__after_atomic() smp_mb()
#define smp_mb__before_spinlock() smp_mb()

+#ifdef CONFIG_PPC_BOOK3S_64
+#define NOSPEC_BARRIER_SLOT nop
+#elif defined(CONFIG_PPC_FSL_BOOK3E)
+#define NOSPEC_BARRIER_SLOT nop; nop
+#endif
+
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+/*
+ * Prevent execution of subsequent instructions until preceding branches have
+ * been fully resolved and are no longer executing speculatively.
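+ *
+ * Typical use (an illustrative sketch): place the barrier between a
+ * user-controlled bounds check and the dependent access, e.g.
+ *
+ *	if (idx < ARRAY_SIZE(table)) {
+ *		barrier_nospec();
+ *		val = table[idx];
+ *	}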
+ */
+#define barrier_nospec_asm NOSPEC_BARRIER_FIXUP_SECTION; NOSPEC_BARRIER_SLOT
+
+// This also acts as a compiler barrier due to the memory clobber.
+#define barrier_nospec() asm (stringify_in_c(barrier_nospec_asm) ::: "memory")
+
+#else /* !CONFIG_PPC_BARRIER_NOSPEC */
+#define barrier_nospec_asm
+#define barrier_nospec()
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
+
#endif /* _ASM_POWERPC_BARRIER_H */
diff --git a/arch/powerpc/include/asm/code-patching-asm.h b/arch/powerpc/include/asm/code-patching-asm.h
new file mode 100644
index 000000000000..ed7b1448493a
--- /dev/null
+++ b/arch/powerpc/include/asm/code-patching-asm.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Copyright 2018, Michael Ellerman, IBM Corporation.
+ */
+#ifndef _ASM_POWERPC_CODE_PATCHING_ASM_H
+#define _ASM_POWERPC_CODE_PATCHING_ASM_H
+
+/* Define a "site" that can be patched */
+.macro patch_site label name
+ .pushsection ".rodata"
+ .balign 4
+ .global \name
+\name:
+ .4byte \label - .
+ .popsection
+.endm
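+
+/*
+ * Example use from asm (an illustrative sketch; the site name is
+ * hypothetical):
+ *
+ *	1:	nop
+ *		patch_site 1b, patch__example_site
+ *
+ * The site symbol can then be handed to patch_instruction_site() etc.
+ */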
+
+#endif /* _ASM_POWERPC_CODE_PATCHING_ASM_H */
diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index 840a5509b3f1..a734b4b34d26 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -28,6 +28,8 @@ unsigned int create_cond_branch(const unsigned int *addr,
unsigned long target, int flags);
int patch_branch(unsigned int *addr, unsigned long target, int flags);
int patch_instruction(unsigned int *addr, unsigned int instr);
+int patch_instruction_site(s32 *addr, unsigned int instr);
+int patch_branch_site(s32 *site, unsigned long target, int flags);

int instr_is_relative_branch(unsigned int instr);
int instr_is_branch_to_addr(const unsigned int *instr, unsigned long addr);
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 9bddbec441b8..3ed536bec462 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -50,6 +50,27 @@
#define EX_PPR 88 /* SMT thread status register (priority) */
#define EX_CTR 96

+#define STF_ENTRY_BARRIER_SLOT \
+ STF_ENTRY_BARRIER_FIXUP_SECTION; \
+ nop; \
+ nop; \
+ nop
+
+#define STF_EXIT_BARRIER_SLOT \
+ STF_EXIT_BARRIER_FIXUP_SECTION; \
+ nop; \
+ nop; \
+ nop; \
+ nop; \
+ nop; \
+ nop
+
+/*
+ * r10 must be free to use, r13 must be paca
+ */
+#define INTERRUPT_TO_KERNEL \
+ STF_ENTRY_BARRIER_SLOT
+
/*
* Macros for annotating the expected destination of (h)rfid
*
@@ -66,16 +87,19 @@
rfid

#define RFI_TO_USER \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
rfid; \
b rfi_flush_fallback

#define RFI_TO_USER_OR_KERNEL \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
rfid; \
b rfi_flush_fallback

#define RFI_TO_GUEST \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
rfid; \
b rfi_flush_fallback
@@ -84,21 +108,25 @@
hrfid

#define HRFI_TO_USER \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
hrfid; \
b hrfi_flush_fallback

#define HRFI_TO_USER_OR_KERNEL \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
hrfid; \
b hrfi_flush_fallback

#define HRFI_TO_GUEST \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
hrfid; \
b hrfi_flush_fallback

#define HRFI_TO_UNKNOWN \
+ STF_EXIT_BARRIER_SLOT; \
RFI_FLUSH_SLOT; \
hrfid; \
b hrfi_flush_fallback
@@ -226,6 +254,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
#define __EXCEPTION_PROLOG_1(area, extra, vec) \
OPT_SAVE_REG_TO_PACA(area+EX_PPR, r9, CPU_FTR_HAS_PPR); \
OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR); \
+ INTERRUPT_TO_KERNEL; \
SAVE_CTR(r10, area); \
mfcr r9; \
extra(vec); \
@@ -512,6 +541,12 @@ label##_relon_hv: \
#define _MASKABLE_EXCEPTION_PSERIES(vec, label, h, extra) \
__MASKABLE_EXCEPTION_PSERIES(vec, label, h, extra)

+#define MASKABLE_EXCEPTION_OOL(vec, label) \
+ .globl label##_ool; \
+label##_ool: \
+ EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, vec); \
+ EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD);
+
#define MASKABLE_EXCEPTION_PSERIES(loc, vec, label) \
. = loc; \
.globl label##_pSeries; \
diff --git a/arch/powerpc/include/asm/feature-fixups.h b/arch/powerpc/include/asm/feature-fixups.h
index 7068bafbb2d6..145a37ab2d3e 100644
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -184,6 +184,22 @@ label##3: \
FTR_ENTRY_OFFSET label##1b-label##3b; \
.popsection;

+#define STF_ENTRY_BARRIER_FIXUP_SECTION \
+953: \
+ .pushsection __stf_entry_barrier_fixup,"a"; \
+ .align 2; \
+954: \
+ FTR_ENTRY_OFFSET 953b-954b; \
+ .popsection;
+
+#define STF_EXIT_BARRIER_FIXUP_SECTION \
+955: \
+ .pushsection __stf_exit_barrier_fixup,"a"; \
+ .align 2; \
+956: \
+ FTR_ENTRY_OFFSET 955b-956b; \
+ .popsection;
+
#define RFI_FLUSH_FIXUP_SECTION \
951: \
.pushsection __rfi_flush_fixup,"a"; \
@@ -192,10 +208,34 @@ label##3: \
FTR_ENTRY_OFFSET 951b-952b; \
.popsection;

+#define NOSPEC_BARRIER_FIXUP_SECTION \
+953: \
+ .pushsection __barrier_nospec_fixup,"a"; \
+ .align 2; \
+954: \
+ FTR_ENTRY_OFFSET 953b-954b; \
+ .popsection;
+
+#define START_BTB_FLUSH_SECTION \
+955: \
+
+#define END_BTB_FLUSH_SECTION \
+956: \
+ .pushsection __btb_flush_fixup,"a"; \
+ .align 2; \
+957: \
+ FTR_ENTRY_OFFSET 955b-957b; \
+ FTR_ENTRY_OFFSET 956b-957b; \
+ .popsection;

#ifndef __ASSEMBLY__

+extern long stf_barrier_fallback;
+extern long __start___stf_entry_barrier_fixup, __stop___stf_entry_barrier_fixup;
+extern long __start___stf_exit_barrier_fixup, __stop___stf_exit_barrier_fixup;
extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+extern long __start___barrier_nospec_fixup, __stop___barrier_nospec_fixup;
+extern long __start__btb_flush_fixup, __stop__btb_flush_fixup;

#endif

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 449bbb87c257..b57db9d09db9 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -292,10 +292,15 @@
#define H_CPU_CHAR_L1D_FLUSH_ORI30 (1ull << 61) // IBM bit 2
#define H_CPU_CHAR_L1D_FLUSH_TRIG2 (1ull << 60) // IBM bit 3
#define H_CPU_CHAR_L1D_THREAD_PRIV (1ull << 59) // IBM bit 4
+#define H_CPU_CHAR_BRANCH_HINTS_HONORED (1ull << 58) // IBM bit 5
+#define H_CPU_CHAR_THREAD_RECONFIG_CTRL (1ull << 57) // IBM bit 6
+#define H_CPU_CHAR_COUNT_CACHE_DISABLED (1ull << 56) // IBM bit 7
+#define H_CPU_CHAR_BCCTR_FLUSH_ASSIST (1ull << 54) // IBM bit 9

#define H_CPU_BEHAV_FAVOUR_SECURITY (1ull << 63) // IBM bit 0
#define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1
#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ull << 61) // IBM bit 2
+#define H_CPU_BEHAV_FLUSH_COUNT_CACHE (1ull << 58) // IBM bit 5

#ifndef __ASSEMBLY__
#include <linux/types.h>
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 45e2aefece16..08e5df3395fa 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -199,8 +199,7 @@ struct paca_struct {
*/
u64 exrfi[13] __aligned(0x80);
void *rfi_flush_fallback_area;
- u64 l1d_flush_congruence;
- u64 l1d_flush_sets;
+ u64 l1d_flush_size;
#endif
};

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 7ab04fc59e24..faf1bb045dee 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -147,6 +147,7 @@
#define PPC_INST_LWSYNC 0x7c2004ac
#define PPC_INST_SYNC 0x7c0004ac
#define PPC_INST_SYNC_MASK 0xfc0007fe
+#define PPC_INST_ISYNC 0x4c00012c
#define PPC_INST_LXVD2X 0x7c000698
#define PPC_INST_MCRXR 0x7c000400
#define PPC_INST_MCRXR_MASK 0xfc0007fe
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 160bb2311bbb..d219816b3e19 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -821,4 +821,15 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,945)
.long 0x2400004c /* rfid */
#endif /* !CONFIG_PPC_BOOK3E */
#endif /* __ASSEMBLY__ */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define BTB_FLUSH(reg) \
+ lis reg,BUCSR_INIT@h; \
+ ori reg,reg,BUCSR_INIT@l; \
+ mtspr SPRN_BUCSR,reg; \
+ isync;
+#else
+#define BTB_FLUSH(reg)
+#endif /* CONFIG_PPC_FSL_BOOK3E */
+
#endif /* _ASM_POWERPC_PPC_ASM_H */
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 2fef74b474f0..410ebee9e339 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -41,7 +41,7 @@
#if defined(CONFIG_PPC_BOOK3E_64)
#define MSR_64BIT MSR_CM

-#define MSR_ (MSR_ME | MSR_CE)
+#define MSR_ (MSR_ME | MSR_RI | MSR_CE)
#define MSR_KERNEL (MSR_ | MSR_64BIT)
#define MSR_USER32 (MSR_ | MSR_PR | MSR_EE)
#define MSR_USER64 (MSR_USER32 | MSR_64BIT)
diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h
new file mode 100644
index 000000000000..759597bf0fd8
--- /dev/null
+++ b/arch/powerpc/include/asm/security_features.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Security related feature bit definitions.
+ *
+ * Copyright 2018, Michael Ellerman, IBM Corporation.
+ */
+
+#ifndef _ASM_POWERPC_SECURITY_FEATURES_H
+#define _ASM_POWERPC_SECURITY_FEATURES_H
+
+
+extern unsigned long powerpc_security_features;
+extern bool rfi_flush;
+
+/* These are bit flags */
+enum stf_barrier_type {
+ STF_BARRIER_NONE = 0x1,
+ STF_BARRIER_FALLBACK = 0x2,
+ STF_BARRIER_EIEIO = 0x4,
+ STF_BARRIER_SYNC_ORI = 0x8,
+};
+
+void setup_stf_barrier(void);
+void do_stf_barrier_fixups(enum stf_barrier_type types);
+void setup_count_cache_flush(void);
+
+static inline void security_ftr_set(unsigned long feature)
+{
+ powerpc_security_features |= feature;
+}
+
+static inline void security_ftr_clear(unsigned long feature)
+{
+ powerpc_security_features &= ~feature;
+}
+
+static inline bool security_ftr_enabled(unsigned long feature)
+{
+ return !!(powerpc_security_features & feature);
+}
+
+
+// Features indicating support for Spectre/Meltdown mitigations
+
+// The L1-D cache can be flushed with ori r30,r30,0
+#define SEC_FTR_L1D_FLUSH_ORI30 0x0000000000000001ull
+
+// The L1-D cache can be flushed with mtspr 882,r0 (aka SPRN_TRIG2)
+#define SEC_FTR_L1D_FLUSH_TRIG2 0x0000000000000002ull
+
+// ori r31,r31,0 acts as a speculation barrier
+#define SEC_FTR_SPEC_BAR_ORI31 0x0000000000000004ull
+
+// Speculation past bctr is disabled
+#define SEC_FTR_BCCTRL_SERIALISED 0x0000000000000008ull
+
+// Entries in L1-D are private to a SMT thread
+#define SEC_FTR_L1D_THREAD_PRIV 0x0000000000000010ull
+
+// Indirect branch prediction cache disabled
+#define SEC_FTR_COUNT_CACHE_DISABLED 0x0000000000000020ull
+
+// bcctr 2,0,0 triggers a hardware assisted count cache flush
+#define SEC_FTR_BCCTR_FLUSH_ASSIST 0x0000000000000800ull
+
+
+// Features indicating need for Spectre/Meltdown mitigations
+
+// The L1-D cache should be flushed on MSR[HV] 1->0 transition (hypervisor to guest)
+#define SEC_FTR_L1D_FLUSH_HV 0x0000000000000040ull
+
+// The L1-D cache should be flushed on MSR[PR] 0->1 transition (kernel to userspace)
+#define SEC_FTR_L1D_FLUSH_PR 0x0000000000000080ull
+
+// A speculation barrier should be used for bounds checks (Spectre variant 1)
+#define SEC_FTR_BNDS_CHK_SPEC_BAR 0x0000000000000100ull
+
+// Firmware configuration indicates user favours security over performance
+#define SEC_FTR_FAVOUR_SECURITY 0x0000000000000200ull
+
+// Software required to flush count cache on context switch
+#define SEC_FTR_FLUSH_COUNT_CACHE 0x0000000000000400ull
+
+
+// Features enabled by default
+#define SEC_FTR_DEFAULT \
+ (SEC_FTR_L1D_FLUSH_HV | \
+ SEC_FTR_L1D_FLUSH_PR | \
+ SEC_FTR_BNDS_CHK_SPEC_BAR | \
+ SEC_FTR_FAVOUR_SECURITY)
+
+#endif /* _ASM_POWERPC_SECURITY_FEATURES_H */
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 7916b56f2e60..d299479c770b 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -8,6 +8,7 @@ extern void ppc_printk_progress(char *s, unsigned short hex);

extern unsigned int rtas_data;
extern unsigned long long memory_limit;
+extern bool init_mem_is_free;
extern unsigned long klimit;
extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);

@@ -36,8 +37,28 @@ enum l1d_flush_type {
L1D_FLUSH_MTTRIG = 0x8,
};

-void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
+void setup_rfi_flush(enum l1d_flush_type, bool enable);
void do_rfi_flush_fixups(enum l1d_flush_type types);
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+void setup_barrier_nospec(void);
+#else
+static inline void setup_barrier_nospec(void) { };
+#endif
+void do_barrier_nospec_fixups(bool enable);
+extern bool barrier_nospec_enabled;
+
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+void do_barrier_nospec_fixups_range(bool enable, void *start, void *end);
+#else
+static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { };
+#endif
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+void setup_spectre_v2(void);
+#else
+static inline void setup_spectre_v2(void) {};
+#endif
+void do_btb_flush_fixups(void);

#endif /* !__ASSEMBLY__ */

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 05f1389228d2..e51ce5a0e221 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -269,6 +269,7 @@ do { \
__chk_user_ptr(ptr); \
if (!is_kernel_addr((unsigned long)__gu_addr)) \
might_fault(); \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__typeof__(*(ptr)))__gu_val; \
__gu_err; \
@@ -283,6 +284,7 @@ do { \
__chk_user_ptr(ptr); \
if (!is_kernel_addr((unsigned long)__gu_addr)) \
might_fault(); \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err; \
@@ -295,8 +297,10 @@ do { \
unsigned long __gu_val = 0; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
might_fault(); \
- if (access_ok(VERIFY_READ, __gu_addr, (size))) \
+ if (access_ok(VERIFY_READ, __gu_addr, (size))) { \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
+ } \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err; \
})
@@ -307,6 +311,7 @@ do { \
unsigned long __gu_val; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
__chk_user_ptr(ptr); \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err; \
@@ -323,8 +328,10 @@ extern unsigned long __copy_tofrom_user(void __user *to,
static inline unsigned long copy_from_user(void *to,
const void __user *from, unsigned long n)
{
- if (likely(access_ok(VERIFY_READ, from, n)))
+ if (likely(access_ok(VERIFY_READ, from, n))) {
+ barrier_nospec();
return __copy_tofrom_user((__force void __user *)to, from, n);
+ }
memset(to, 0, n);
return n;
}
@@ -359,21 +366,27 @@ static inline unsigned long __copy_from_user_inatomic(void *to,

switch (n) {
case 1:
+ barrier_nospec();
__get_user_size(*(u8 *)to, from, 1, ret);
break;
case 2:
+ barrier_nospec();
__get_user_size(*(u16 *)to, from, 2, ret);
break;
case 4:
+ barrier_nospec();
__get_user_size(*(u32 *)to, from, 4, ret);
break;
case 8:
+ barrier_nospec();
__get_user_size(*(u64 *)to, from, 8, ret);
break;
}
if (ret == 0)
return 0;
}
+
+ barrier_nospec();
return __copy_tofrom_user((__force void __user *)to, from, n);
}

@@ -400,6 +413,7 @@ static inline unsigned long __copy_to_user_inatomic(void __user *to,
if (ret == 0)
return 0;
}
+
return __copy_tofrom_user(to, (__force const void __user *)from, n);
}

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index ba336930d448..22ed3c32fca8 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -44,6 +44,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_power.o
obj-$(CONFIG_PPC_BOOK3S_64) += mce.o mce_power.o
obj64-$(CONFIG_RELOCATABLE) += reloc_64.o
obj-$(CONFIG_PPC_BOOK3E_64) += exceptions-64e.o idle_book3e.o
+obj-$(CONFIG_PPC_BARRIER_NOSPEC) += security.o
obj-$(CONFIG_PPC64) += vdso64/
obj-$(CONFIG_ALTIVEC) += vecemu.o
obj-$(CONFIG_PPC_970_NAP) += idle_power4.o
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index d92705e3a0c1..de3c29c51503 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -245,8 +245,7 @@ int main(void)
DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce));
DEFINE(PACA_RFI_FLUSH_FALLBACK_AREA, offsetof(struct paca_struct, rfi_flush_fallback_area));
DEFINE(PACA_EXRFI, offsetof(struct paca_struct, exrfi));
- DEFINE(PACA_L1D_FLUSH_CONGRUENCE, offsetof(struct paca_struct, l1d_flush_congruence));
- DEFINE(PACA_L1D_FLUSH_SETS, offsetof(struct paca_struct, l1d_flush_sets));
+ DEFINE(PACA_L1D_FLUSH_SIZE, offsetof(struct paca_struct, l1d_flush_size));
#endif
DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id));
DEFINE(PACAKEXECSTATE, offsetof(struct paca_struct, kexec_state));
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 3728e617e17e..609bc7d01f13 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -33,6 +33,7 @@
#include <asm/unistd.h>
#include <asm/ftrace.h>
#include <asm/ptrace.h>
+#include <asm/barrier.h>

/*
* MSR_KERNEL is > 0x10000 on 4xx/Book-E since it include MSR_CE.
@@ -340,6 +341,15 @@ syscall_dotrace_cont:
ori r10,r10,sys_call_table@l
slwi r0,r0,2
bge- 66f
+
+ barrier_nospec_asm
+ /*
+ * Prevent the load of the handler below (based on the user-passed
+ * system call number) being speculatively executed until the test
+ * against NR_syscalls and branch to 66f above has
+ * committed.
+ */
+
lwzx r10,r10,r0 /* Fetch system call handler [ptr] */
mtlr r10
addi r9,r1,STACK_FRAME_OVERHEAD
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 59be96917369..6d36a4fb4acf 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -25,6 +25,7 @@
#include <asm/page.h>
#include <asm/mmu.h>
#include <asm/thread_info.h>
+#include <asm/code-patching-asm.h>
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/cputable.h>
@@ -36,6 +37,7 @@
#include <asm/hw_irq.h>
#include <asm/context_tracking.h>
#include <asm/tm.h>
+#include <asm/barrier.h>
#ifdef CONFIG_PPC_BOOK3S
#include <asm/exception-64s.h>
#else
@@ -75,6 +77,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
std r0,GPR0(r1)
std r10,GPR1(r1)
beq 2f /* if from kernel mode */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+START_BTB_FLUSH_SECTION
+ BTB_FLUSH(r10)
+END_BTB_FLUSH_SECTION
+#endif
ACCOUNT_CPU_USER_ENTRY(r10, r11)
2: std r2,GPR2(r1)
std r3,GPR3(r1)
@@ -177,6 +184,15 @@ system_call: /* label this so stack traces look sane */
clrldi r8,r8,32
15:
slwi r0,r0,4
+
+ barrier_nospec_asm
+ /*
+ * Prevent the load of the handler below (based on the user-passed
+ * system call number) being speculatively executed until the test
+ * against NR_syscalls and branch to .Lsyscall_enosys above has
+ * committed.
+ */
+
ldx r12,r11,r0 /* Fetch system call handler [ptr] */
mtctr r12
bctrl /* Call handler */
@@ -440,6 +456,57 @@ _GLOBAL(ret_from_kernel_thread)
li r3,0
b .Lsyscall_exit

+#ifdef CONFIG_PPC_BOOK3S_64
+
+#define FLUSH_COUNT_CACHE \
+1: nop; \
+ patch_site 1b, patch__call_flush_count_cache
+
+
+#define BCCTR_FLUSH .long 0x4c400420
+
+.macro nops number
+ .rept \number
+ nop
+ .endr
+.endm
+
+.balign 32
+.global flush_count_cache
+flush_count_cache:
+ /* Save LR into r9 */
+ mflr r9
+
+ .rept 64
+ bl .+4
+ .endr
+ b 1f
+ nops 6
+
+ .balign 32
+ /* Restore LR */
+1: mtlr r9
+ li r9,0x7fff
+ mtctr r9
+
+ BCCTR_FLUSH
+
+2: nop
+ patch_site 2b, patch__flush_count_cache_return
+
+ nops 3
+
+ .rept 278
+ .balign 32
+ BCCTR_FLUSH
+ nops 7
+ .endr
+
+ blr
+#else
+#define FLUSH_COUNT_CACHE
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
/*
* This routine switches between two different tasks. The process
* state of one is saved on its kernel stack. Then the state
@@ -503,6 +570,8 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
#endif

+ FLUSH_COUNT_CACHE
+
#ifdef CONFIG_SMP
/* We need a sync somewhere here to make sure that if the
* previous task gets rescheduled on another CPU, it sees all
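
A note on the constants in flush_count_cache above: the 64 "bl .+4"
instructions stuff the link stack, and the repeated BCCTR_FLUSH word
displaces the count cache used to predict bcctr targets. BCCTR_FLUSH
itself is a raw encoding of the special form "bcctr 2,0,0";
reconstructing it from the standard branch-conditional field layout (a
compilable cross-check, not kernel code):

    #include <stdio.h>

    int main(void)
    {
            unsigned int insn = (19u << 26) |  /* primary opcode 19 */
                                (2u  << 21) |  /* BO = 2 */
                                (0u  << 16) |  /* BI = 0 */
                                (528u << 1);   /* extended opcode 528: bcctr */

            printf("BCCTR_FLUSH = 0x%08x\n", insn); /* prints 0x4c400420 */
            return 0;
    }
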
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 5cc93f0b52ca..48ec841ea1bf 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -295,7 +295,8 @@ ret_from_mc_except:
andi. r10,r11,MSR_PR; /* save stack pointer */ \
beq 1f; /* branch around if supervisor */ \
ld r1,PACAKSAVE(r13); /* get kernel stack coming from usr */\
-1: cmpdi cr1,r1,0; /* check if SP makes sense */ \
+1: type##_BTB_FLUSH \
+ cmpdi cr1,r1,0; /* check if SP makes sense */ \
bge- cr1,exc_##n##_bad_stack;/* bad stack (TODO: out of line) */ \
mfspr r10,SPRN_##type##_SRR0; /* read SRR0 before touching stack */

@@ -327,6 +328,30 @@ ret_from_mc_except:
#define SPRN_MC_SRR0 SPRN_MCSRR0
#define SPRN_MC_SRR1 SPRN_MCSRR1

+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define GEN_BTB_FLUSH \
+ START_BTB_FLUSH_SECTION \
+ beq 1f; \
+ BTB_FLUSH(r10) \
+ 1: \
+ END_BTB_FLUSH_SECTION
+
+#define CRIT_BTB_FLUSH \
+ START_BTB_FLUSH_SECTION \
+ BTB_FLUSH(r10) \
+ END_BTB_FLUSH_SECTION
+
+#define DBG_BTB_FLUSH CRIT_BTB_FLUSH
+#define MC_BTB_FLUSH CRIT_BTB_FLUSH
+#define GDBELL_BTB_FLUSH GEN_BTB_FLUSH
+#else
+#define GEN_BTB_FLUSH
+#define CRIT_BTB_FLUSH
+#define DBG_BTB_FLUSH
+#define MC_BTB_FLUSH
+#define GDBELL_BTB_FLUSH
+#endif
+
#define NORMAL_EXCEPTION_PROLOG(n, intnum, addition) \
EXCEPTION_PROLOG(n, intnum, GEN, addition##_GEN(n))

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 938a30fef031..10e7cec9553d 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -36,6 +36,7 @@ BEGIN_FTR_SECTION \
END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE) \
mr r9,r13 ; \
GET_PACA(r13) ; \
+ INTERRUPT_TO_KERNEL ; \
mfspr r11,SPRN_SRR0 ; \
0:

@@ -292,7 +293,9 @@ hardware_interrupt_hv:
. = 0x900
.globl decrementer_pSeries
decrementer_pSeries:
- _MASKABLE_EXCEPTION_PSERIES(0x900, decrementer, EXC_STD, SOFTEN_TEST_PR)
+ SET_SCRATCH0(r13)
+ EXCEPTION_PROLOG_0(PACA_EXGEN)
+ b decrementer_ool

STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)

@@ -319,6 +322,7 @@ system_call_pSeries:
OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR);
HMT_MEDIUM;
std r10,PACA_EXGEN+EX_R10(r13)
+ INTERRUPT_TO_KERNEL
OPT_SAVE_REG_TO_PACA(PACA_EXGEN+EX_PPR, r9, CPU_FTR_HAS_PPR);
mfcr r9
KVMTEST(0xc00)
@@ -607,6 +611,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)

.align 7
/* moved from 0xe00 */
+ MASKABLE_EXCEPTION_OOL(0x900, decrementer)
STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
STD_EXCEPTION_HV_OOL(0xe22, h_instr_storage)
@@ -1564,6 +1569,21 @@ power4_fixup_nap:
blr
#endif

+ .balign 16
+ .globl stf_barrier_fallback
+stf_barrier_fallback:
+ std r9,PACA_EXRFI+EX_R9(r13)
+ std r10,PACA_EXRFI+EX_R10(r13)
+ sync
+ ld r9,PACA_EXRFI+EX_R9(r13)
+ ld r10,PACA_EXRFI+EX_R10(r13)
+ ori 31,31,0
+ .rept 14
+ b 1f
+1:
+ .endr
+ blr
+
.globl rfi_flush_fallback
rfi_flush_fallback:
SET_SCRATCH0(r13);
@@ -1571,39 +1591,37 @@ rfi_flush_fallback:
std r9,PACA_EXRFI+EX_R9(r13)
std r10,PACA_EXRFI+EX_R10(r13)
std r11,PACA_EXRFI+EX_R11(r13)
- std r12,PACA_EXRFI+EX_R12(r13)
- std r8,PACA_EXRFI+EX_R13(r13)
mfctr r9
ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
- ld r11,PACA_L1D_FLUSH_SETS(r13)
- ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
- /*
- * The load adresses are at staggered offsets within cachelines,
- * which suits some pipelines better (on others it should not
- * hurt).
- */
- addi r12,r12,8
+ ld r11,PACA_L1D_FLUSH_SIZE(r13)
+ srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
mtctr r11
DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */

/* order ld/st prior to dcbt stop all streams with flushing */
sync
-1: li r8,0
- .rept 8 /* 8-way set associative */
- ldx r11,r10,r8
- add r8,r8,r12
- xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not
- add r8,r8,r11 // Add 0, this creates a dependency on the ldx
- .endr
- addi r10,r10,128 /* 128 byte cache line */
+
+ /*
+ * The load addresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+1:
+ ld r11,(0x80 + 8)*0(r10)
+ ld r11,(0x80 + 8)*1(r10)
+ ld r11,(0x80 + 8)*2(r10)
+ ld r11,(0x80 + 8)*3(r10)
+ ld r11,(0x80 + 8)*4(r10)
+ ld r11,(0x80 + 8)*5(r10)
+ ld r11,(0x80 + 8)*6(r10)
+ ld r11,(0x80 + 8)*7(r10)
+ addi r10,r10,0x80*8
bdnz 1b

mtctr r9
ld r9,PACA_EXRFI+EX_R9(r13)
ld r10,PACA_EXRFI+EX_R10(r13)
ld r11,PACA_EXRFI+EX_R11(r13)
- ld r12,PACA_EXRFI+EX_R12(r13)
- ld r8,PACA_EXRFI+EX_R13(r13)
GET_SCRATCH0(r13);
rfid

@@ -1614,39 +1632,37 @@ hrfi_flush_fallback:
std r9,PACA_EXRFI+EX_R9(r13)
std r10,PACA_EXRFI+EX_R10(r13)
std r11,PACA_EXRFI+EX_R11(r13)
- std r12,PACA_EXRFI+EX_R12(r13)
- std r8,PACA_EXRFI+EX_R13(r13)
mfctr r9
ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
- ld r11,PACA_L1D_FLUSH_SETS(r13)
- ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
- /*
- * The load adresses are at staggered offsets within cachelines,
- * which suits some pipelines better (on others it should not
- * hurt).
- */
- addi r12,r12,8
+ ld r11,PACA_L1D_FLUSH_SIZE(r13)
+ srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
mtctr r11
DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */

/* order ld/st prior to dcbt stop all streams with flushing */
sync
-1: li r8,0
- .rept 8 /* 8-way set associative */
- ldx r11,r10,r8
- add r8,r8,r12
- xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not
- add r8,r8,r11 // Add 0, this creates a dependency on the ldx
- .endr
- addi r10,r10,128 /* 128 byte cache line */
+
+ /*
+ * The load addresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+1:
+ ld r11,(0x80 + 8)*0(r10)
+ ld r11,(0x80 + 8)*1(r10)
+ ld r11,(0x80 + 8)*2(r10)
+ ld r11,(0x80 + 8)*3(r10)
+ ld r11,(0x80 + 8)*4(r10)
+ ld r11,(0x80 + 8)*5(r10)
+ ld r11,(0x80 + 8)*6(r10)
+ ld r11,(0x80 + 8)*7(r10)
+ addi r10,r10,0x80*8
bdnz 1b

mtctr r9
ld r9,PACA_EXRFI+EX_R9(r13)
ld r10,PACA_EXRFI+EX_R10(r13)
ld r11,PACA_EXRFI+EX_R11(r13)
- ld r12,PACA_EXRFI+EX_R12(r13)
- ld r8,PACA_EXRFI+EX_R13(r13)
GET_SCRATCH0(r13);
hrfid

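The rewritten fallback flush above drops the set/way arithmetic and
simply sweeps the zeroed fallback area linearly: one load per 128-byte
line across l1d_flush_size bytes displaces the whole L1D. An
illustrative C model (the real sequence runs in asm, unrolled 8x, with
prefetch streams stopped):

    /* Models the displacement flush; assumes 128-byte lines as the
     * asm above does. Illustrative only. */
    static void l1d_displacement_flush(const char *fallback_area,
                                       unsigned long flush_size)
    {
            volatile const char *p = fallback_area;
            unsigned long off;

            for (off = 0; off < flush_size; off += 128)
                    (void)p[off];   /* each load evicts one old line */
    }
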
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index a620203f7de3..7b98c7351f6c 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -31,6 +31,16 @@
*/
#define THREAD_NORMSAVE(offset) (THREAD_NORMSAVES + (offset * 4))

+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define BOOKE_CLEAR_BTB(reg) \
+START_BTB_FLUSH_SECTION \
+ BTB_FLUSH(reg) \
+END_BTB_FLUSH_SECTION
+#else
+#define BOOKE_CLEAR_BTB(reg)
+#endif
+
+
#define NORMAL_EXCEPTION_PROLOG(intno) \
mtspr SPRN_SPRG_WSCRATCH0, r10; /* save one register */ \
mfspr r10, SPRN_SPRG_THREAD; \
@@ -42,6 +52,7 @@
andi. r11, r11, MSR_PR; /* check whether user or kernel */\
mr r11, r1; \
beq 1f; \
+ BOOKE_CLEAR_BTB(r11) \
/* if from user, start at top of this thread's kernel stack */ \
lwz r11, THREAD_INFO-THREAD(r10); \
ALLOC_STACK_FRAME(r11, THREAD_SIZE); \
@@ -127,6 +138,7 @@
stw r9,_CCR(r8); /* save CR on stack */\
mfspr r11,exc_level_srr1; /* check whether user or kernel */\
DO_KVM BOOKE_INTERRUPT_##intno exc_level_srr1; \
+ BOOKE_CLEAR_BTB(r10) \
andi. r11,r11,MSR_PR; \
mfspr r11,SPRN_SPRG_THREAD; /* if from user, start at top of */\
lwz r11,THREAD_INFO-THREAD(r11); /* this thread's kernel stack */\
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index fffd1f96bb1d..275769b6fb0d 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -451,6 +451,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
mfcr r13
stw r13, THREAD_NORMSAVE(3)(r10)
DO_KVM BOOKE_INTERRUPT_DTLB_MISS SPRN_SRR1
+START_BTB_FLUSH_SECTION
+ mfspr r11, SPRN_SRR1
+ andi. r10,r11,MSR_PR
+ beq 1f
+ BTB_FLUSH(r10)
+1:
+END_BTB_FLUSH_SECTION
mfspr r10, SPRN_DEAR /* Get faulting address */

/* If we are faulting a kernel address, we have to use the
@@ -545,6 +552,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
mfcr r13
stw r13, THREAD_NORMSAVE(3)(r10)
DO_KVM BOOKE_INTERRUPT_ITLB_MISS SPRN_SRR1
+START_BTB_FLUSH_SECTION
+ mfspr r11, SPRN_SRR1
+ andi. r10,r11,MSR_PR
+ beq 1f
+ BTB_FLUSH(r10)
+1:
+END_BTB_FLUSH_SECTION
+
mfspr r10, SPRN_SRR0 /* Get faulting address */

/* If we are faulting a kernel address, we have to use the
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 9547381b631a..ff009be97a42 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -67,7 +67,15 @@ int module_finalize(const Elf_Ehdr *hdr,
do_feature_fixups(powerpc_firmware_features,
(void *)sect->sh_addr,
(void *)sect->sh_addr + sect->sh_size);
-#endif
+#endif /* CONFIG_PPC64 */
+
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+ sect = find_section(hdr, sechdrs, "__spec_barrier_fixup");
+ if (sect != NULL)
+ do_barrier_nospec_fixups_range(barrier_nospec_enabled,
+ (void *)sect->sh_addr,
+ (void *)sect->sh_addr + sect->sh_size);
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */

sect = find_section(hdr, sechdrs, "__lwsync_fixup");
if (sect != NULL)
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
new file mode 100644
index 000000000000..fe30ddfd51ee
--- /dev/null
+++ b/arch/powerpc/kernel/security.c
@@ -0,0 +1,434 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// Security related flags and so on.
+//
+// Copyright 2018, Michael Ellerman, IBM Corporation.
+
+#include <linux/cpu.h>
+#include <linux/kernel.h>
+#include <linux/debugfs.h>
+#include <linux/device.h>
+#include <linux/seq_buf.h>
+
+#include <asm/debug.h>
+#include <asm/asm-prototypes.h>
+#include <asm/code-patching.h>
+#include <asm/security_features.h>
+#include <asm/setup.h>
+
+
+unsigned long powerpc_security_features __read_mostly = SEC_FTR_DEFAULT;
+
+enum count_cache_flush_type {
+ COUNT_CACHE_FLUSH_NONE = 0x1,
+ COUNT_CACHE_FLUSH_SW = 0x2,
+ COUNT_CACHE_FLUSH_HW = 0x4,
+};
+static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE;
+
+bool barrier_nospec_enabled;
+static bool no_nospec;
+static bool btb_flush_enabled;
+#ifdef CONFIG_PPC_FSL_BOOK3E
+static bool no_spectrev2;
+#endif
+
+static void enable_barrier_nospec(bool enable)
+{
+ barrier_nospec_enabled = enable;
+ do_barrier_nospec_fixups(enable);
+}
+
+void setup_barrier_nospec(void)
+{
+ bool enable;
+
+ /*
+ * It would make sense to check SEC_FTR_SPEC_BAR_ORI31 below as well.
+ * But there's a good reason not to. The two flags we check below are
+ * both enabled by default in the kernel, so if the hcall is not
+ * functional they will be enabled.
+ * On a system where the host firmware has been updated (so the ori
+ * functions as a barrier), but on which the hypervisor (KVM/Qemu) has
+ * not been updated, we would like to enable the barrier. Dropping the
+ * check for SEC_FTR_SPEC_BAR_ORI31 achieves that. The only downside is
+ * we potentially enable the barrier on systems where the host firmware
+ * is not updated, but that's harmless as it's a no-op.
+ */
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+ security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);
+
+ if (!no_nospec)
+ enable_barrier_nospec(enable);
+}
+
+static int __init handle_nospectre_v1(char *p)
+{
+ no_nospec = true;
+
+ return 0;
+}
+early_param("nospectre_v1", handle_nospectre_v1);
+
+#ifdef CONFIG_DEBUG_FS
+static int barrier_nospec_set(void *data, u64 val)
+{
+ switch (val) {
+ case 0:
+ case 1:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (!!val == !!barrier_nospec_enabled)
+ return 0;
+
+ enable_barrier_nospec(!!val);
+
+ return 0;
+}
+
+static int barrier_nospec_get(void *data, u64 *val)
+{
+ *val = barrier_nospec_enabled ? 1 : 0;
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec,
+ barrier_nospec_get, barrier_nospec_set, "%llu\n");
+
+static __init int barrier_nospec_debugfs_init(void)
+{
+ debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL,
+ &fops_barrier_nospec);
+ return 0;
+}
+device_initcall(barrier_nospec_debugfs_init);
+#endif /* CONFIG_DEBUG_FS */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+static int __init handle_nospectre_v2(char *p)
+{
+ no_spectrev2 = true;
+
+ return 0;
+}
+early_param("nospectre_v2", handle_nospectre_v2);
+void setup_spectre_v2(void)
+{
+ if (no_spectrev2)
+ do_btb_flush_fixups();
+ else
+ btb_flush_enabled = true;
+}
+#endif /* CONFIG_PPC_FSL_BOOK3E */
+
+#ifdef CONFIG_PPC_BOOK3S_64
+ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ bool thread_priv;
+
+ thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV);
+
+ if (rfi_flush || thread_priv) {
+ struct seq_buf s;
+ seq_buf_init(&s, buf, PAGE_SIZE - 1);
+
+ seq_buf_printf(&s, "Mitigation: ");
+
+ if (rfi_flush)
+ seq_buf_printf(&s, "RFI Flush");
+
+ if (rfi_flush && thread_priv)
+ seq_buf_printf(&s, ", ");
+
+ if (thread_priv)
+ seq_buf_printf(&s, "L1D private per thread");
+
+ seq_buf_printf(&s, "\n");
+
+ return s.len;
+ }
+
+ if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
+ !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
+ return sprintf(buf, "Not affected\n");
+
+ return sprintf(buf, "Vulnerable\n");
+}
+#endif
+
+ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct seq_buf s;
+
+ seq_buf_init(&s, buf, PAGE_SIZE - 1);
+
+ if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
+ if (barrier_nospec_enabled)
+ seq_buf_printf(&s, "Mitigation: __user pointer sanitization");
+ else
+ seq_buf_printf(&s, "Vulnerable");
+
+ if (security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31))
+ seq_buf_printf(&s, ", ori31 speculation barrier enabled");
+
+ seq_buf_printf(&s, "\n");
+ } else
+ seq_buf_printf(&s, "Not affected\n");
+
+ return s.len;
+}
+
+ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct seq_buf s;
+ bool bcs, ccd;
+
+ seq_buf_init(&s, buf, PAGE_SIZE - 1);
+
+ bcs = security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED);
+ ccd = security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED);
+
+ if (bcs || ccd) {
+ seq_buf_printf(&s, "Mitigation: ");
+
+ if (bcs)
+ seq_buf_printf(&s, "Indirect branch serialisation (kernel only)");
+
+ if (bcs && ccd)
+ seq_buf_printf(&s, ", ");
+
+ if (ccd)
+ seq_buf_printf(&s, "Indirect branch cache disabled");
+ } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) {
+ seq_buf_printf(&s, "Mitigation: Software count cache flush");
+
+ if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW)
+ seq_buf_printf(&s, " (hardware accelerated)");
+ } else if (btb_flush_enabled) {
+ seq_buf_printf(&s, "Mitigation: Branch predictor state flush");
+ } else {
+ seq_buf_printf(&s, "Vulnerable");
+ }
+
+ seq_buf_printf(&s, "\n");
+
+ return s.len;
+}
+
+#ifdef CONFIG_PPC_BOOK3S_64
+/*
+ * Store-forwarding barrier support.
+ */
+
+static enum stf_barrier_type stf_enabled_flush_types;
+static bool no_stf_barrier;
+bool stf_barrier;
+
+static int __init handle_no_stf_barrier(char *p)
+{
+ pr_info("stf-barrier: disabled on command line.");
+ no_stf_barrier = true;
+ return 0;
+}
+
+early_param("no_stf_barrier", handle_no_stf_barrier);
+
+/* This is the generic flag used by other architectures */
+static int __init handle_ssbd(char *p)
+{
+ if (!p || strncmp(p, "auto", 5) == 0 || strncmp(p, "on", 2) == 0) {
+ /* Until firmware tells us, we have the barrier with auto */
+ return 0;
+ } else if (strncmp(p, "off", 3) == 0) {
+ handle_no_stf_barrier(NULL);
+ return 0;
+ } else
+ return 1;
+
+ return 0;
+}
+early_param("spec_store_bypass_disable", handle_ssbd);
+
+/* This is the generic flag used by other architectures */
+static int __init handle_no_ssbd(char *p)
+{
+ handle_no_stf_barrier(NULL);
+ return 0;
+}
+early_param("nospec_store_bypass_disable", handle_no_ssbd);
+
+static void stf_barrier_enable(bool enable)
+{
+ if (enable)
+ do_stf_barrier_fixups(stf_enabled_flush_types);
+ else
+ do_stf_barrier_fixups(STF_BARRIER_NONE);
+
+ stf_barrier = enable;
+}
+
+void setup_stf_barrier(void)
+{
+ enum stf_barrier_type type;
+ bool enable, hv;
+
+ hv = cpu_has_feature(CPU_FTR_HVMODE);
+
+ /* Default to fallback in case fw-features are not available */
+ if (cpu_has_feature(CPU_FTR_ARCH_207S))
+ type = STF_BARRIER_SYNC_ORI;
+ else if (cpu_has_feature(CPU_FTR_ARCH_206))
+ type = STF_BARRIER_FALLBACK;
+ else
+ type = STF_BARRIER_NONE;
+
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) ||
+ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && hv));
+
+ if (type == STF_BARRIER_FALLBACK) {
+ pr_info("stf-barrier: fallback barrier available\n");
+ } else if (type == STF_BARRIER_SYNC_ORI) {
+ pr_info("stf-barrier: hwsync barrier available\n");
+ } else if (type == STF_BARRIER_EIEIO) {
+ pr_info("stf-barrier: eieio barrier available\n");
+ }
+
+ stf_enabled_flush_types = type;
+
+ if (!no_stf_barrier)
+ stf_barrier_enable(enable);
+}
+
+ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ if (stf_barrier && stf_enabled_flush_types != STF_BARRIER_NONE) {
+ const char *type;
+ switch (stf_enabled_flush_types) {
+ case STF_BARRIER_EIEIO:
+ type = "eieio";
+ break;
+ case STF_BARRIER_SYNC_ORI:
+ type = "hwsync";
+ break;
+ case STF_BARRIER_FALLBACK:
+ type = "fallback";
+ break;
+ default:
+ type = "unknown";
+ }
+ return sprintf(buf, "Mitigation: Kernel entry/exit barrier (%s)\n", type);
+ }
+
+ if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
+ !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
+ return sprintf(buf, "Not affected\n");
+
+ return sprintf(buf, "Vulnerable\n");
+}
+
+#ifdef CONFIG_DEBUG_FS
+static int stf_barrier_set(void *data, u64 val)
+{
+ bool enable;
+
+ if (val == 1)
+ enable = true;
+ else if (val == 0)
+ enable = false;
+ else
+ return -EINVAL;
+
+ /* Only do anything if we're changing state */
+ if (enable != stf_barrier)
+ stf_barrier_enable(enable);
+
+ return 0;
+}
+
+static int stf_barrier_get(void *data, u64 *val)
+{
+ *val = stf_barrier ? 1 : 0;
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_stf_barrier, stf_barrier_get, stf_barrier_set, "%llu\n");
+
+static __init int stf_barrier_debugfs_init(void)
+{
+ debugfs_create_file("stf_barrier", 0600, powerpc_debugfs_root, NULL, &fops_stf_barrier);
+ return 0;
+}
+device_initcall(stf_barrier_debugfs_init);
+#endif /* CONFIG_DEBUG_FS */
+
+static void toggle_count_cache_flush(bool enable)
+{
+ if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) {
+ patch_instruction_site(&patch__call_flush_count_cache, PPC_INST_NOP);
+ count_cache_flush_type = COUNT_CACHE_FLUSH_NONE;
+ pr_info("count-cache-flush: software flush disabled.\n");
+ return;
+ }
+
+ patch_branch_site(&patch__call_flush_count_cache,
+ (u64)&flush_count_cache, BRANCH_SET_LINK);
+
+ if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) {
+ count_cache_flush_type = COUNT_CACHE_FLUSH_SW;
+ pr_info("count-cache-flush: full software flush sequence enabled.\n");
+ return;
+ }
+
+ patch_instruction_site(&patch__flush_count_cache_return, PPC_INST_BLR);
+ count_cache_flush_type = COUNT_CACHE_FLUSH_HW;
+ pr_info("count-cache-flush: hardware assisted flush sequence enabled\n");
+}
+
+void setup_count_cache_flush(void)
+{
+ toggle_count_cache_flush(true);
+}
+
+#ifdef CONFIG_DEBUG_FS
+static int count_cache_flush_set(void *data, u64 val)
+{
+ bool enable;
+
+ if (val == 1)
+ enable = true;
+ else if (val == 0)
+ enable = false;
+ else
+ return -EINVAL;
+
+ toggle_count_cache_flush(enable);
+
+ return 0;
+}
+
+static int count_cache_flush_get(void *data, u64 *val)
+{
+ if (count_cache_flush_type == COUNT_CACHE_FLUSH_NONE)
+ *val = 0;
+ else
+ *val = 1;
+
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_count_cache_flush, count_cache_flush_get,
+ count_cache_flush_set, "%llu\n");
+
+static __init int count_cache_flush_debugfs_init(void)
+{
+ debugfs_create_file("count_cache_flush", 0600, powerpc_debugfs_root,
+ NULL, &fops_count_cache_flush);
+ return 0;
+}
+device_initcall(count_cache_flush_debugfs_init);
+#endif /* CONFIG_DEBUG_FS */
+#endif /* CONFIG_PPC_BOOK3S_64 */
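
security.c above leans on a few helpers from asm/security_features.h
that are outside this hunk; assuming they are the usual bit operations
on powerpc_security_features, they amount to (sketch):

    extern unsigned long powerpc_security_features;

    static inline void security_ftr_set(unsigned long feature)
    {
            powerpc_security_features |= feature;
    }

    static inline void security_ftr_clear(unsigned long feature)
    {
            powerpc_security_features &= ~feature;
    }

    static inline bool security_ftr_enabled(unsigned long feature)
    {
            return !!(powerpc_security_features & feature);
    }
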
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index ad8c9db61237..cb37f27bb928 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -322,6 +322,9 @@ void __init setup_arch(char **cmdline_p)
ppc_md.setup_arch();
if ( ppc_md.progress ) ppc_md.progress("arch: exit", 0x3eab);

+ setup_barrier_nospec();
+ setup_spectre_v2();
+
paging_init();

/* Initialize the MMU context management stuff */
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 9eb469bed22b..11590f6cb2f9 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -736,6 +736,9 @@ void __init setup_arch(char **cmdline_p)
if (ppc_md.setup_arch)
ppc_md.setup_arch();

+ setup_barrier_nospec();
+ setup_spectre_v2();
+
paging_init();

/* Initialize the MMU context management stuff */
@@ -873,9 +876,6 @@ static void do_nothing(void *unused)

void rfi_flush_enable(bool enable)
{
- if (rfi_flush == enable)
- return;
-
if (enable) {
do_rfi_flush_fixups(enabled_flush_types);
on_each_cpu(do_nothing, NULL, 1);
@@ -885,11 +885,15 @@ void rfi_flush_enable(bool enable)
rfi_flush = enable;
}

-static void init_fallback_flush(void)
+static void __ref init_fallback_flush(void)
{
u64 l1d_size, limit;
int cpu;

+ /* Only allocate the fallback flush area once (at boot time). */
+ if (l1d_flush_fallback_area)
+ return;
+
l1d_size = ppc64_caches.dsize;
limit = min(safe_stack_limit(), ppc64_rma_size);

@@ -902,34 +906,23 @@ static void init_fallback_flush(void)
memset(l1d_flush_fallback_area, 0, l1d_size * 2);

for_each_possible_cpu(cpu) {
- /*
- * The fallback flush is currently coded for 8-way
- * associativity. Different associativity is possible, but it
- * will be treated as 8-way and may not evict the lines as
- * effectively.
- *
- * 128 byte lines are mandatory.
- */
- u64 c = l1d_size / 8;
-
paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
- paca[cpu].l1d_flush_congruence = c;
- paca[cpu].l1d_flush_sets = c / 128;
+ paca[cpu].l1d_flush_size = l1d_size;
}
}

-void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
+void setup_rfi_flush(enum l1d_flush_type types, bool enable)
{
if (types & L1D_FLUSH_FALLBACK) {
- pr_info("rfi-flush: Using fallback displacement flush\n");
+ pr_info("rfi-flush: fallback displacement flush available\n");
init_fallback_flush();
}

if (types & L1D_FLUSH_ORI)
- pr_info("rfi-flush: Using ori type flush\n");
+ pr_info("rfi-flush: ori type flush available\n");

if (types & L1D_FLUSH_MTTRIG)
- pr_info("rfi-flush: Using mttrig type flush\n");
+ pr_info("rfi-flush: mttrig type flush available\n");

enabled_flush_types = types;

@@ -940,13 +933,19 @@ void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
#ifdef CONFIG_DEBUG_FS
static int rfi_flush_set(void *data, u64 val)
{
+ bool enable;
+
if (val == 1)
- rfi_flush_enable(true);
+ enable = true;
else if (val == 0)
- rfi_flush_enable(false);
+ enable = false;
else
return -EINVAL;

+ /* Only do anything if we're changing state */
+ if (enable != rfi_flush)
+ rfi_flush_enable(enable);
+
return 0;
}

@@ -965,12 +964,4 @@ static __init int rfi_flush_debugfs_init(void)
}
device_initcall(rfi_flush_debugfs_init);
#endif
-
-ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
-{
- if (rfi_flush)
- return sprintf(buf, "Mitigation: RFI Flush\n");
-
- return sprintf(buf, "Vulnerable\n");
-}
#endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 072a23a17350..876ac9d52afc 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -73,14 +73,45 @@ SECTIONS
RODATA

#ifdef CONFIG_PPC64
+ . = ALIGN(8);
+ __stf_entry_barrier_fixup : AT(ADDR(__stf_entry_barrier_fixup) - LOAD_OFFSET) {
+ __start___stf_entry_barrier_fixup = .;
+ *(__stf_entry_barrier_fixup)
+ __stop___stf_entry_barrier_fixup = .;
+ }
+
+ . = ALIGN(8);
+ __stf_exit_barrier_fixup : AT(ADDR(__stf_exit_barrier_fixup) - LOAD_OFFSET) {
+ __start___stf_exit_barrier_fixup = .;
+ *(__stf_exit_barrier_fixup)
+ __stop___stf_exit_barrier_fixup = .;
+ }
+
. = ALIGN(8);
__rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) {
__start___rfi_flush_fixup = .;
*(__rfi_flush_fixup)
__stop___rfi_flush_fixup = .;
}
-#endif
+#endif /* CONFIG_PPC64 */

+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+ . = ALIGN(8);
+ __spec_barrier_fixup : AT(ADDR(__spec_barrier_fixup) - LOAD_OFFSET) {
+ __start___barrier_nospec_fixup = .;
+ *(__barrier_nospec_fixup)
+ __stop___barrier_nospec_fixup = .;
+ }
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+ . = ALIGN(8);
+ __spec_btb_flush_fixup : AT(ADDR(__spec_btb_flush_fixup) - LOAD_OFFSET) {
+ __start__btb_flush_fixup = .;
+ *(__btb_flush_fixup)
+ __stop__btb_flush_fixup = .;
+ }
+#endif
EXCEPTION_TABLE(0)

NOTES :kernel :notes
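
Each fixup section declared above is an array of self-relative offsets
bracketed by linker-generated __start/__stop symbols; the patching code
walks the array and turns every entry back into the address of the
instruction it annotates. A sketch of the consumption side (section
names match the script above; the loop body is illustrative):

    extern long __start___barrier_nospec_fixup[];
    extern long __stop___barrier_nospec_fixup[];

    static void for_each_barrier_nospec_site(void (*patch)(unsigned int *))
    {
            long *entry;

            for (entry = __start___barrier_nospec_fixup;
                 entry < __stop___barrier_nospec_fixup; entry++)
                    /* *entry is (site - entry), so adding recovers site */
                    patch((unsigned int *)((unsigned long)entry + *entry));
    }
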
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index 81bd8a07aa51..612b7f6a887f 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -75,6 +75,10 @@
PPC_LL r1, VCPU_HOST_STACK(r4)
PPC_LL r2, HOST_R2(r1)

+START_BTB_FLUSH_SECTION
+ BTB_FLUSH(r10)
+END_BTB_FLUSH_SECTION
+
mfspr r10, SPRN_PID
lwz r8, VCPU_HOST_PID(r4)
PPC_LL r11, VCPU_SHARED(r4)
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index 990db69a1d0b..fa88f641ac03 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -277,6 +277,13 @@ int kvmppc_core_emulate_mtspr_e500(struct kvm_vcpu *vcpu, int sprn, ulong spr_va
vcpu->arch.pwrmgtcr0 = spr_val;
break;

+ case SPRN_BUCSR:
+ /*
+ * If we are here, it means that we have already flushed the
+ * branch predictor, so just return to guest.
+ */
+ break;
+
/* extra exceptions */
#ifdef CONFIG_SPE_POSSIBLE
case SPRN_IVOR32:
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index d5edbeb8eb82..31d31a10f71f 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -14,12 +14,25 @@
#include <asm/page.h>
#include <asm/code-patching.h>
#include <asm/uaccess.h>
+#include <asm/setup.h>
+#include <asm/sections.h>


+static inline bool is_init(unsigned int *addr)
+{
+ return addr >= (unsigned int *)__init_begin && addr < (unsigned int *)__init_end;
+}
+
int patch_instruction(unsigned int *addr, unsigned int instr)
{
int err;

+ /* Make sure we aren't patching a freed init section */
+ if (*PTRRELOC(&init_mem_is_free) && is_init(addr)) {
+ pr_debug("Skipping init section patching addr: 0x%px\n", addr);
+ return 0;
+ }
+
__put_user_size(instr, addr, 4, err);
if (err)
return err;
@@ -32,6 +45,22 @@ int patch_branch(unsigned int *addr, unsigned long target, int flags)
return patch_instruction(addr, create_branch(addr, target, flags));
}

+int patch_branch_site(s32 *site, unsigned long target, int flags)
+{
+ unsigned int *addr;
+
+ addr = (unsigned int *)((unsigned long)site + *site);
+ return patch_instruction(addr, create_branch(addr, target, flags));
+}
+
+int patch_instruction_site(s32 *site, unsigned int instr)
+{
+ unsigned int *addr;
+
+ addr = (unsigned int *)((unsigned long)site + *site);
+ return patch_instruction(addr, instr);
+}
+
unsigned int create_branch(const unsigned int *addr,
unsigned long target, int flags)
{
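
patch_branch_site() above resolves a patch site the same way (the
site's address is &site + *site) and then needs an I-form branch aimed
at the target. Assuming the displacement fits the signed 26-bit LI
field, create_branch() boils down to this sketch (the real function
also range-checks the offset and handles absolute branches):

    static unsigned int make_rel_branch(unsigned long from,
                                        unsigned long to, int set_link)
    {
            long offset = to - from;

            return 0x48000000u |                          /* b: opcode 18 << 26 */
                   ((unsigned int)offset & 0x03FFFFFCu) | /* LI, word aligned */
                   (set_link ? 1u : 0u);                  /* LK bit turns b into bl */
    }
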
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 3af014684872..7bdfc19a491d 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -21,7 +21,7 @@
#include <asm/page.h>
#include <asm/sections.h>
#include <asm/setup.h>
-
+#include <asm/security_features.h>

struct fixup_entry {
unsigned long mask;
@@ -115,6 +115,120 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)
}

#ifdef CONFIG_PPC_BOOK3S_64
+void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
+{
+ unsigned int instrs[3], *dest;
+ long *start, *end;
+ int i;
+
+ start = PTRRELOC(&__start___stf_entry_barrier_fixup),
+ end = PTRRELOC(&__stop___stf_entry_barrier_fixup);
+
+ instrs[0] = 0x60000000; /* nop */
+ instrs[1] = 0x60000000; /* nop */
+ instrs[2] = 0x60000000; /* nop */
+
+ i = 0;
+ if (types & STF_BARRIER_FALLBACK) {
+ instrs[i++] = 0x7d4802a6; /* mflr r10 */
+ instrs[i++] = 0x60000000; /* branch patched below */
+ instrs[i++] = 0x7d4803a6; /* mtlr r10 */
+ } else if (types & STF_BARRIER_EIEIO) {
+ instrs[i++] = 0x7e0006ac; /* eieio + bit 6 hint */
+ } else if (types & STF_BARRIER_SYNC_ORI) {
+ instrs[i++] = 0x7c0004ac; /* hwsync */
+ instrs[i++] = 0xe94d0000; /* ld r10,0(r13) */
+ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ }
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+ patch_instruction(dest, instrs[0]);
+
+ if (types & STF_BARRIER_FALLBACK)
+ patch_branch(dest + 1, (unsigned long)&stf_barrier_fallback,
+ BRANCH_SET_LINK);
+ else
+ patch_instruction(dest + 1, instrs[1]);
+
+ patch_instruction(dest + 2, instrs[2]);
+ }
+
+ printk(KERN_DEBUG "stf-barrier: patched %d entry locations (%s barrier)\n", i,
+ (types == STF_BARRIER_NONE) ? "no" :
+ (types == STF_BARRIER_FALLBACK) ? "fallback" :
+ (types == STF_BARRIER_EIEIO) ? "eieio" :
+ (types == (STF_BARRIER_SYNC_ORI)) ? "hwsync"
+ : "unknown");
+}
+
+void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
+{
+ unsigned int instrs[6], *dest;
+ long *start, *end;
+ int i;
+
+ start = PTRRELOC(&__start___stf_exit_barrier_fixup),
+ end = PTRRELOC(&__stop___stf_exit_barrier_fixup);
+
+ instrs[0] = 0x60000000; /* nop */
+ instrs[1] = 0x60000000; /* nop */
+ instrs[2] = 0x60000000; /* nop */
+ instrs[3] = 0x60000000; /* nop */
+ instrs[4] = 0x60000000; /* nop */
+ instrs[5] = 0x60000000; /* nop */
+
+ i = 0;
+ if (types & STF_BARRIER_FALLBACK || types & STF_BARRIER_SYNC_ORI) {
+ if (cpu_has_feature(CPU_FTR_HVMODE)) {
+ instrs[i++] = 0x7db14ba6; /* mtspr 0x131, r13 (HSPRG1) */
+ instrs[i++] = 0x7db04aa6; /* mfspr r13, 0x130 (HSPRG0) */
+ } else {
+ instrs[i++] = 0x7db243a6; /* mtsprg 2,r13 */
+ instrs[i++] = 0x7db142a6; /* mfsprg r13,1 */
+ }
+ instrs[i++] = 0x7c0004ac; /* hwsync */
+ instrs[i++] = 0xe9ad0000; /* ld r13,0(r13) */
+ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ if (cpu_has_feature(CPU_FTR_HVMODE)) {
+ instrs[i++] = 0x7db14aa6; /* mfspr r13, 0x131 (HSPRG1) */
+ } else {
+ instrs[i++] = 0x7db242a6; /* mfsprg r13,2 */
+ }
+ } else if (types & STF_BARRIER_EIEIO) {
+ instrs[i++] = 0x7e0006ac; /* eieio + bit 6 hint */
+ }
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+ patch_instruction(dest, instrs[0]);
+ patch_instruction(dest + 1, instrs[1]);
+ patch_instruction(dest + 2, instrs[2]);
+ patch_instruction(dest + 3, instrs[3]);
+ patch_instruction(dest + 4, instrs[4]);
+ patch_instruction(dest + 5, instrs[5]);
+ }
+ printk(KERN_DEBUG "stf-barrier: patched %d exit locations (%s barrier)\n", i,
+ (types == STF_BARRIER_NONE) ? "no" :
+ (types == STF_BARRIER_FALLBACK) ? "fallback" :
+ (types == STF_BARRIER_EIEIO) ? "eieio" :
+ (types == (STF_BARRIER_SYNC_ORI)) ? "hwsync"
+ : "unknown");
+}
+
+
+void do_stf_barrier_fixups(enum stf_barrier_type types)
+{
+ do_stf_entry_barrier_fixups(types);
+ do_stf_exit_barrier_fixups(types);
+}
+
void do_rfi_flush_fixups(enum l1d_flush_type types)
{
unsigned int instrs[3], *dest;
@@ -151,10 +265,110 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
patch_instruction(dest + 2, instrs[2]);
}

- printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);
+ printk(KERN_DEBUG "rfi-flush: patched %d locations (%s flush)\n", i,
+ (types == L1D_FLUSH_NONE) ? "no" :
+ (types == L1D_FLUSH_FALLBACK) ? "fallback displacement" :
+ (types & L1D_FLUSH_ORI) ? (types & L1D_FLUSH_MTTRIG)
+ ? "ori+mttrig type"
+ : "ori type" :
+ (types & L1D_FLUSH_MTTRIG) ? "mttrig type"
+ : "unknown");
+}
+
+void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
+{
+ unsigned int instr, *dest;
+ long *start, *end;
+ int i;
+
+ start = fixup_start;
+ end = fixup_end;
+
+ instr = 0x60000000; /* nop */
+
+ if (enable) {
+ pr_info("barrier-nospec: using ORI speculation barrier\n");
+ instr = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ }
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+ patch_instruction(dest, instr);
+ }
+
+ printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
}
+
#endif /* CONFIG_PPC_BOOK3S_64 */

+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+void do_barrier_nospec_fixups(bool enable)
+{
+ void *start, *end;
+
+ start = PTRRELOC(&__start___barrier_nospec_fixup),
+ end = PTRRELOC(&__stop___barrier_nospec_fixup);
+
+ do_barrier_nospec_fixups_range(enable, start, end);
+}
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
+{
+ unsigned int instr[2], *dest;
+ long *start, *end;
+ int i;
+
+ start = fixup_start;
+ end = fixup_end;
+
+ instr[0] = PPC_INST_NOP;
+ instr[1] = PPC_INST_NOP;
+
+ if (enable) {
+ pr_info("barrier-nospec: using isync; sync as speculation barrier\n");
+ instr[0] = PPC_INST_ISYNC;
+ instr[1] = PPC_INST_SYNC;
+ }
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+ patch_instruction(dest, instr[0]);
+ patch_instruction(dest + 1, instr[1]);
+ }
+
+ printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
+}
+
+static void patch_btb_flush_section(long *curr)
+{
+ unsigned int *start, *end;
+
+ start = (void *)curr + *curr;
+ end = (void *)curr + *(curr + 1);
+ for (; start < end; start++) {
+ pr_devel("patching dest %lx\n", (unsigned long)start);
+ patch_instruction(start, PPC_INST_NOP);
+ }
+}
+
+void do_btb_flush_fixups(void)
+{
+ long *start, *end;
+
+ start = PTRRELOC(&__start__btb_flush_fixup);
+ end = PTRRELOC(&__stop__btb_flush_fixup);
+
+ for (; start < end; start += 2)
+ patch_btb_flush_section(start);
+}
+#endif /* CONFIG_PPC_FSL_BOOK3E */
+
void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
{
long *start, *end;
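
As a cross-check of the two words do_barrier_nospec_fixups_range()
toggles between on Book3S: both are ori encodings (primary opcode 24),
nop being ori 0,0,0 and the barrier ori 31,31,0. A compilable sketch:

    #include <stdio.h>

    static unsigned int ori(unsigned int rs, unsigned int ra, unsigned int ui)
    {
            return (24u << 26) | (rs << 21) | (ra << 16) | ui;
    }

    int main(void)
    {
            printf("nop     = 0x%08x\n", ori(0, 0, 0));   /* 0x60000000 */
            printf("barrier = 0x%08x\n", ori(31, 31, 0)); /* 0x63ff0000 */
            return 0;
    }
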
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 22d94c3e6fc4..1efe5ca5c3bc 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -62,6 +62,7 @@
#endif

unsigned long long memory_limit;
+bool init_mem_is_free;

#ifdef CONFIG_HIGHMEM
pte_t *kmap_pte;
@@ -381,6 +382,7 @@ void __init mem_init(void)
void free_initmem(void)
{
ppc_md.progress = ppc_printk_progress;
+ init_mem_is_free = true;
free_initmem_default(POISON_FREE_INITMEM);
}

diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index 29d6987c37ba..5486d56da289 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -69,6 +69,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
std r15,EX_TLB_R15(r12)
std r10,EX_TLB_CR(r12)
#ifdef CONFIG_PPC_FSL_BOOK3E
+START_BTB_FLUSH_SECTION
+ mfspr r11, SPRN_SRR1
+ andi. r10,r11,MSR_PR
+ beq 1f
+ BTB_FLUSH(r10)
+1:
+END_BTB_FLUSH_SECTION
std r7,EX_TLB_R7(r12)
#endif
TLB_MISS_PROLOG_STATS
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index c57afc619b20..e14b52c7ebd8 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -37,53 +37,99 @@
#include <asm/smp.h>
#include <asm/tm.h>
#include <asm/setup.h>
+#include <asm/security_features.h>

#include "powernv.h"

+
+static bool fw_feature_is(const char *state, const char *name,
+ struct device_node *fw_features)
+{
+ struct device_node *np;
+ bool rc = false;
+
+ np = of_get_child_by_name(fw_features, name);
+ if (np) {
+ rc = of_property_read_bool(np, state);
+ of_node_put(np);
+ }
+
+ return rc;
+}
+
+static void init_fw_feat_flags(struct device_node *np)
+{
+ if (fw_feature_is("enabled", "inst-spec-barrier-ori31,31,0", np))
+ security_ftr_set(SEC_FTR_SPEC_BAR_ORI31);
+
+ if (fw_feature_is("enabled", "fw-bcctrl-serialized", np))
+ security_ftr_set(SEC_FTR_BCCTRL_SERIALISED);
+
+ if (fw_feature_is("enabled", "inst-l1d-flush-ori30,30,0", np))
+ security_ftr_set(SEC_FTR_L1D_FLUSH_ORI30);
+
+ if (fw_feature_is("enabled", "inst-l1d-flush-trig2", np))
+ security_ftr_set(SEC_FTR_L1D_FLUSH_TRIG2);
+
+ if (fw_feature_is("enabled", "fw-l1d-thread-split", np))
+ security_ftr_set(SEC_FTR_L1D_THREAD_PRIV);
+
+ if (fw_feature_is("enabled", "fw-count-cache-disabled", np))
+ security_ftr_set(SEC_FTR_COUNT_CACHE_DISABLED);
+
+ if (fw_feature_is("enabled", "fw-count-cache-flush-bcctr2,0,0", np))
+ security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST);
+
+ if (fw_feature_is("enabled", "needs-count-cache-flush-on-context-switch", np))
+ security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE);
+
+ /*
+ * The features below are enabled by default, so we instead look to see
+ * if firmware has *disabled* them, and clear them if so.
+ */
+ if (fw_feature_is("disabled", "speculation-policy-favor-security", np))
+ security_ftr_clear(SEC_FTR_FAVOUR_SECURITY);
+
+ if (fw_feature_is("disabled", "needs-l1d-flush-msr-pr-0-to-1", np))
+ security_ftr_clear(SEC_FTR_L1D_FLUSH_PR);
+
+ if (fw_feature_is("disabled", "needs-l1d-flush-msr-hv-1-to-0", np))
+ security_ftr_clear(SEC_FTR_L1D_FLUSH_HV);
+
+ if (fw_feature_is("disabled", "needs-spec-barrier-for-bound-checks", np))
+ security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR);
+}
+
static void pnv_setup_rfi_flush(void)
{
struct device_node *np, *fw_features;
enum l1d_flush_type type;
- int enable;
+ bool enable;

/* Default to fallback in case fw-features are not available */
type = L1D_FLUSH_FALLBACK;
- enable = 1;

np = of_find_node_by_name(NULL, "ibm,opal");
fw_features = of_get_child_by_name(np, "fw-features");
of_node_put(np);

if (fw_features) {
- np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");
- if (np && of_property_read_bool(np, "enabled"))
- type = L1D_FLUSH_MTTRIG;
+ init_fw_feat_flags(fw_features);
+ of_node_put(fw_features);

- of_node_put(np);
+ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_TRIG2))
+ type = L1D_FLUSH_MTTRIG;

- np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0");
- if (np && of_property_read_bool(np, "enabled"))
+ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_ORI30))
type = L1D_FLUSH_ORI;
-
- of_node_put(np);
-
- /* Enable unless firmware says NOT to */
- enable = 2;
- np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0");
- if (np && of_property_read_bool(np, "disabled"))
- enable--;
-
- of_node_put(np);
-
- np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1");
- if (np && of_property_read_bool(np, "disabled"))
- enable--;
-
- of_node_put(np);
- of_node_put(fw_features);
}

- setup_rfi_flush(type, enable > 0);
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
+ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || \
+ security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV));
+
+ setup_rfi_flush(type, enable);
+ setup_count_cache_flush();
}

static void __init pnv_setup_arch(void)
@@ -91,6 +137,7 @@ static void __init pnv_setup_arch(void)
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);

pnv_setup_rfi_flush();
+ setup_stf_barrier();

/* Initialize SMP */
pnv_smp_init();
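
For reference, the nodes parsed by init_fw_feat_flags() live under
/ibm,opal/fw-features in the device tree, one child per feature with a
presence-only "enabled" or "disabled" property. A hypothetical new
feature would be picked up with the same one-liner (the node name here
is illustrative, not a real OPAL feature):

    if (fw_feature_is("enabled", "my-new-mitigation", fw_features))
            pr_info("firmware advertises my-new-mitigation\n");
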
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 8dd0c8edefd6..c773396d0969 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -314,6 +314,9 @@ void post_mobility_fixup(void)
printk(KERN_ERR "Post-mobility device tree update "
"failed: %d\n", rc);

+ /* Possibly switch to a new RFI flush type */
+ pseries_setup_rfi_flush();
+
return;
}

diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 8411c27293e4..e7d80797384d 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -81,4 +81,6 @@ extern struct pci_controller_ops pseries_pci_controller_ops;

unsigned long pseries_memory_block_size(void);

+void pseries_setup_rfi_flush(void);
+
#endif /* _PSERIES_PSERIES_H */
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index dd2545fc9947..9cc976ff7fec 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -67,6 +67,7 @@
#include <asm/eeh.h>
#include <asm/reg.h>
#include <asm/plpar_wrappers.h>
+#include <asm/security_features.h>

#include "pseries.h"

@@ -499,37 +500,87 @@ static void __init find_and_init_phbs(void)
of_pci_check_probe_only();
}

-static void pseries_setup_rfi_flush(void)
+static void init_cpu_char_feature_flags(struct h_cpu_char_result *result)
+{
+ /*
+ * The features below are disabled by default, so we instead look to see
+ * if firmware has *enabled* them, and set them if so.
+ */
+ if (result->character & H_CPU_CHAR_SPEC_BAR_ORI31)
+ security_ftr_set(SEC_FTR_SPEC_BAR_ORI31);
+
+ if (result->character & H_CPU_CHAR_BCCTRL_SERIALISED)
+ security_ftr_set(SEC_FTR_BCCTRL_SERIALISED);
+
+ if (result->character & H_CPU_CHAR_L1D_FLUSH_ORI30)
+ security_ftr_set(SEC_FTR_L1D_FLUSH_ORI30);
+
+ if (result->character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
+ security_ftr_set(SEC_FTR_L1D_FLUSH_TRIG2);
+
+ if (result->character & H_CPU_CHAR_L1D_THREAD_PRIV)
+ security_ftr_set(SEC_FTR_L1D_THREAD_PRIV);
+
+ if (result->character & H_CPU_CHAR_COUNT_CACHE_DISABLED)
+ security_ftr_set(SEC_FTR_COUNT_CACHE_DISABLED);
+
+ if (result->character & H_CPU_CHAR_BCCTR_FLUSH_ASSIST)
+ security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST);
+
+ if (result->behaviour & H_CPU_BEHAV_FLUSH_COUNT_CACHE)
+ security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE);
+
+ /*
+ * The features below are enabled by default, so we instead look to see
+ * if firmware has *disabled* them, and clear them if so.
+ */
+ if (!(result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY))
+ security_ftr_clear(SEC_FTR_FAVOUR_SECURITY);
+
+ if (!(result->behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
+ security_ftr_clear(SEC_FTR_L1D_FLUSH_PR);
+
+ if (!(result->behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR))
+ security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR);
+}
+
+void pseries_setup_rfi_flush(void)
{
struct h_cpu_char_result result;
enum l1d_flush_type types;
bool enable;
long rc;

- /* Enable by default */
- enable = true;
+ /*
+ * Set features to the defaults assumed by init_cpu_char_feature_flags()
+ * so it can set/clear again any features that might have changed after
+ * migration, and in case the hypercall fails and it is not even called.
+ */
+ powerpc_security_features = SEC_FTR_DEFAULT;

rc = plpar_get_cpu_characteristics(&result);
- if (rc == H_SUCCESS) {
- types = L1D_FLUSH_NONE;
+ if (rc == H_SUCCESS)
+ init_cpu_char_feature_flags(&result);

- if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
- types |= L1D_FLUSH_MTTRIG;
- if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
- types |= L1D_FLUSH_ORI;
+ /*
+ * We're the guest so this doesn't apply to us, clear it to simplify
+ * handling of it elsewhere.
+ */
+ security_ftr_clear(SEC_FTR_L1D_FLUSH_HV);

- /* Use fallback if nothing set in hcall */
- if (types == L1D_FLUSH_NONE)
- types = L1D_FLUSH_FALLBACK;
+ types = L1D_FLUSH_FALLBACK;

- if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
- enable = false;
- } else {
- /* Default to fallback if case hcall is not available */
- types = L1D_FLUSH_FALLBACK;
- }
+ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_TRIG2))
+ types |= L1D_FLUSH_MTTRIG;
+
+ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_ORI30))
+ types |= L1D_FLUSH_ORI;
+
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
+ security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR);

setup_rfi_flush(types, enable);
+ setup_count_cache_flush();
}

static void __init pSeries_setup_arch(void)
@@ -549,6 +600,7 @@ static void __init pSeries_setup_arch(void)
fwnmi_init();

pseries_setup_rfi_flush();
+ setup_stf_barrier();

/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);
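
The hcall result parsed above comes back as two bitmasks; the struct
(declared in asm/hvcall.h, shown here as a sketch) separates what the
CPU has from how the OS is advised to behave:

    struct h_cpu_char_result {
            u64 character;  /* H_CPU_CHAR_*: available characteristics */
            u64 behaviour;  /* H_CPU_BEHAV_*: recommended OS behaviour */
    };
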
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 786bf01691c9..83619ebede93 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2144,6 +2144,8 @@ static void dump_one_paca(int cpu)
DUMP(p, slb_cache_ptr, "x");
for (i = 0; i < SLB_CACHE_ENTRIES; i++)
printf(" slb_cache[%d]: = 0x%016lx\n", i, p->slb_cache[i]);
+
+ DUMP(p, rfi_flush_fallback_area, "px");
#endif
DUMP(p, dscr_default, "llx");
#ifdef CONFIG_PPC_BOOK3E
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4598d087dec2..4d1262cf630c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -893,13 +893,7 @@ config NR_CPUS
approximately eight kilobytes to the kernel image.

config SCHED_SMT
- bool "SMT (Hyperthreading) scheduler support"
- depends on SMP
- ---help---
- SMT scheduler support improves the CPU scheduler's decision making
- when dealing with Intel Pentium 4 chips with HyperThreading at a
- cost of slightly increased overhead in some places. If unsure say
- N here.
+ def_bool y if SMP

config SCHED_MC
def_bool y
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 071582a3b5c0..57be07f27f37 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -28,6 +28,7 @@
#include <asm/vdso.h>
#include <asm/uaccess.h>
#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>

#define CREATE_TRACE_POINTS
#include <trace/events/syscalls.h>
@@ -295,6 +296,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
#endif

user_enter();
+
+ mds_user_clear_cpu_buffers();
}

#define SYSCALL_EXIT_WORK_FLAGS \
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 297dda4d5947..15ed35f5097d 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -159,7 +159,8 @@ quiet_cmd_vdso = VDSO $@
sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@'

VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \
- $(call ld-option, --build-id) -Bsymbolic
+ $(call ld-option, --build-id) $(call ld-option, --eh-frame-hdr) \
+ -Bsymbolic
GCOV_PROFILE := n

#
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index a5fa3195a230..d9f7d1770e98 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -214,6 +214,7 @@
#define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */
#define X86_FEATURE_ZEN ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
#define X86_FEATURE_L1TF_PTEINV ( 7*32+29) /* "" L1TF workaround PTE inversion */
+#define X86_FEATURE_IBRS_ENHANCED ( 7*32+30) /* Enhanced IBRS */

/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
@@ -265,10 +266,12 @@

/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */
-#define X86_FEATURE_AMD_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */
-#define X86_FEATURE_AMD_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */
-#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_AMD_IBPB (13*32+12) /* "" Indirect Branch Prediction Barrier */
+#define X86_FEATURE_AMD_IBRS (13*32+14) /* "" Indirect Branch Restricted Speculation */
+#define X86_FEATURE_AMD_STIBP (13*32+15) /* "" Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_AMD_SSBD (13*32+24) /* "" Speculative Store Bypass Disable */
#define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */
+#define X86_FEATURE_AMD_SSB_NO (13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */

/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
@@ -307,6 +310,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */
@@ -332,5 +336,7 @@
#define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
+#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
+#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSBDS variant of BUG_MDS */

#endif /* _ASM_X86_CPUFEATURES_H */
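
Feature word 18 above is populated from CPUID(EAX=7,ECX=0).EDX, so
MD_CLEAR support (18*32+10, i.e. EDX bit 10) can be probed from
userspace with the same leaf. A small sketch using GCC's cpuid.h:

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax, ebx, ecx, edx;

            if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                    printf("MD_CLEAR (VERW flush): %s\n",
                           (edx >> 10) & 1 ? "yes" : "no");
            return 0;
    }
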
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index e13ff5a14633..6801f958e254 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -50,19 +50,23 @@

/* "Small Core" Processors (Atom) */

-#define INTEL_FAM6_ATOM_PINEVIEW 0x1C
-#define INTEL_FAM6_ATOM_LINCROFT 0x26
-#define INTEL_FAM6_ATOM_PENWELL 0x27
-#define INTEL_FAM6_ATOM_CLOVERVIEW 0x35
-#define INTEL_FAM6_ATOM_CEDARVIEW 0x36
-#define INTEL_FAM6_ATOM_SILVERMONT1 0x37 /* BayTrail/BYT / Valleyview */
-#define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */
-#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* CherryTrail / Braswell */
-#define INTEL_FAM6_ATOM_MERRIFIELD 0x4A /* Tangier */
-#define INTEL_FAM6_ATOM_MOOREFIELD 0x5A /* Annidale */
-#define INTEL_FAM6_ATOM_GOLDMONT 0x5C
-#define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */
-#define INTEL_FAM6_ATOM_GEMINI_LAKE 0x7A
+#define INTEL_FAM6_ATOM_BONNELL 0x1C /* Diamondville, Pineview */
+#define INTEL_FAM6_ATOM_BONNELL_MID 0x26 /* Silverthorne, Lincroft */
+
+#define INTEL_FAM6_ATOM_SALTWELL 0x36 /* Cedarview */
+#define INTEL_FAM6_ATOM_SALTWELL_MID 0x27 /* Penwell */
+#define INTEL_FAM6_ATOM_SALTWELL_TABLET 0x35 /* Cloverview */
+
+#define INTEL_FAM6_ATOM_SILVERMONT 0x37 /* Bay Trail, Valleyview */
+#define INTEL_FAM6_ATOM_SILVERMONT_X 0x4D /* Avaton, Rangely */
+#define INTEL_FAM6_ATOM_SILVERMONT_MID 0x4A /* Merrifield */
+
+#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* Cherry Trail, Braswell */
+#define INTEL_FAM6_ATOM_AIRMONT_MID 0x5A /* Moorefield */
+
+#define INTEL_FAM6_ATOM_GOLDMONT 0x5C /* Apollo Lake */
+#define INTEL_FAM6_ATOM_GOLDMONT_X 0x5F /* Denverton */
+#define INTEL_FAM6_ATOM_GOLDMONT_PLUS 0x7A /* Gemini Lake */

/* Xeon Phi */

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 8afbdcd3032b..46d8b99a0ff1 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -4,6 +4,9 @@
#include <asm/processor-flags.h>

#ifndef __ASSEMBLY__
+
+#include <asm/nospec-branch.h>
+
/*
* Interrupt control:
*/
@@ -49,11 +52,13 @@ static inline void native_irq_enable(void)

static inline void native_safe_halt(void)
{
+ mds_idle_clear_cpu_buffers();
asm volatile("sti; hlt": : :"memory");
}

static inline void native_halt(void)
{
+ mds_idle_clear_cpu_buffers();
asm volatile("hlt": : :"memory");
}

diff --git a/arch/x86/include/asm/microcode_intel.h b/arch/x86/include/asm/microcode_intel.h
index 8559b0102ea1..90343ba50485 100644
--- a/arch/x86/include/asm/microcode_intel.h
+++ b/arch/x86/include/asm/microcode_intel.h
@@ -53,6 +53,21 @@ struct extended_sigtable {

#define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)

+static inline u32 intel_get_microcode_revision(void)
+{
+ u32 rev, dummy;
+
+ native_wrmsrl(MSR_IA32_UCODE_REV, 0);
+
+ /* As documented in the SDM: Do a CPUID 1 here */
+ sync_core();
+
+ /* get the current revision from MSR 0x8B */
+ native_rdmsr(MSR_IA32_UCODE_REV, dummy, rev);
+
+ return rev;
+}
+
extern int has_newer_microcode(void *mc, unsigned int csig, int cpf, int rev);
extern int microcode_sanity_check(void *mc, int print_err);
extern int find_matching_signature(void *mc, unsigned int csig, int cpf);
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index caa00191e565..d4f5b8209393 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1,6 +1,8 @@
#ifndef _ASM_X86_MSR_INDEX_H
#define _ASM_X86_MSR_INDEX_H

+#include <linux/bits.h>
+
/* CPU model specific register (MSR) numbers */

/* x86-64 specific MSRs */
@@ -33,13 +35,14 @@

/* Intel MSRs. Some also available on other CPUs */
#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
-#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
-#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */
+#define SPEC_CTRL_IBRS BIT(0) /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_STIBP_SHIFT 1 /* Single Thread Indirect Branch Predictor (STIBP) bit */
+#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
-#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
+#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */

#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
-#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
+#define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */

#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
@@ -56,13 +59,18 @@
#define MSR_MTRRcap 0x000000fe

#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
-#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
-#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
-#define ARCH_CAP_SSB_NO (1 << 4) /*
- * Not susceptible to Speculative Store Bypass
- * attack, so no Speculative Store Bypass
- * control required.
- */
+#define ARCH_CAP_RDCL_NO BIT(0) /* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL BIT(1) /* Enhanced IBRS support */
+#define ARCH_CAP_SSB_NO BIT(4) /*
+ * Not susceptible to Speculative Store Bypass
+ * attack, so no Speculative Store Bypass
+ * control required.
+ */
+#define ARCH_CAP_MDS_NO BIT(5) /*
+ * Not susceptible to
+ * Microarchitectural Data
+ * Sampling (MDS) vulnerabilities.
+ */

#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
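
How the new ARCH_CAP_MDS_NO bit is meant to be consumed during bug
detection (a sketch assuming the helpers available in this tree;
mds_detect is an illustrative name, not the kernel's function):

    static void __init mds_detect(struct cpuinfo_x86 *c)
    {
            u64 ia32_cap = 0;

            if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
                    rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

            /* No hardware statement of immunity: assume affected */
            if (!(ia32_cap & ARCH_CAP_MDS_NO))
                    setup_force_cpu_bug(X86_BUG_MDS);
    }
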
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 0deeb2d26df7..b98dbdaee8ac 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -4,6 +4,7 @@
#include <linux/sched.h>

#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>

#define MWAIT_SUBSTATE_MASK 0xf
#define MWAIT_CSTATE_MASK 0xf
@@ -38,6 +39,8 @@ static inline void __monitorx(const void *eax, unsigned long ecx,

static inline void __mwait(unsigned long eax, unsigned long ecx)
{
+ mds_idle_clear_cpu_buffers();
+
/* "mwait %eax, %ecx;" */
asm volatile(".byte 0x0f, 0x01, 0xc9;"
:: "a" (eax), "c" (ecx));
@@ -72,6 +75,8 @@ static inline void __mwait(unsigned long eax, unsigned long ecx)
static inline void __mwaitx(unsigned long eax, unsigned long ebx,
unsigned long ecx)
{
+ /* No MDS buffer clear as this is AMD/HYGON only */
+
/* "mwaitx %eax, %ebx, %ecx;" */
asm volatile(".byte 0x0f, 0x01, 0xfb;"
:: "a" (eax), "b" (ebx), "c" (ecx));
@@ -79,6 +84,8 @@ static inline void __mwaitx(unsigned long eax, unsigned long ebx,

static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
{
+ mds_idle_clear_cpu_buffers();
+
trace_hardirqs_on();
/* "mwait %eax, %ecx;" */
asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index b4c74c24c890..e58c078f3d96 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -3,6 +3,8 @@
#ifndef _ASM_X86_NOSPEC_BRANCH_H_
#define _ASM_X86_NOSPEC_BRANCH_H_

+#include <linux/static_key.h>
+
#include <asm/alternative.h>
#include <asm/alternative-asm.h>
#include <asm/cpufeatures.h>
@@ -169,7 +171,15 @@ enum spectre_v2_mitigation {
SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
SPECTRE_V2_RETPOLINE_GENERIC,
SPECTRE_V2_RETPOLINE_AMD,
- SPECTRE_V2_IBRS,
+ SPECTRE_V2_IBRS_ENHANCED,
+};
+
+/* The indirect branch speculation control variants */
+enum spectre_v2_user_mitigation {
+ SPECTRE_V2_USER_NONE,
+ SPECTRE_V2_USER_STRICT,
+ SPECTRE_V2_USER_PRCTL,
+ SPECTRE_V2_USER_SECCOMP,
};

/* The Speculative Store Bypass disable variants */
@@ -248,6 +258,60 @@ do { \
preempt_enable(); \
} while (0)

+DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
+DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+
+DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
+
+#include <asm/segment.h>
+
+/**
+ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * This uses the otherwise unused and obsolete VERW instruction in
+ * combination with microcode which triggers a CPU buffer flush when the
+ * instruction is executed.
+ */
+static inline void mds_clear_cpu_buffers(void)
+{
+ static const u16 ds = __KERNEL_DS;
+
+ /*
+ * Has to be the memory-operand variant because only that
+ * guarantees the CPU buffer flush functionality according to
+ * documentation. The register-operand variant does not.
+ * Works with any segment selector, but a valid writable
+ * data segment is the fastest variant.
+ *
+ * "cc" clobber is required because VERW modifies ZF.
+ */
+ asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+}
+
+/**
+ * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_user_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_user_clear))
+ mds_clear_cpu_buffers();
+}
+
+/**
+ * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_idle_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_idle_clear))
+ mds_clear_cpu_buffers();
+}
+
#endif /* __ASSEMBLY__ */

/*
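The two static keys let hot paths skip the VERW entirely when the
mitigation is not needed: mds_user_clear gates the exit-to-user clearing,
mds_idle_clear the idle-entry clearing. One concrete call site is the NMI
exit hunk later in this patch (sketch):

	/* do_nmi() tail: flush CPU buffers before resuming user code */
	if (user_mode(regs))
		mds_user_clear_cpu_buffers();

With the key disabled the static branch folds to a NOP, so unaffected CPUs
pay nothing.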
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 221a32ed1372..f12e61e2a86b 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -44,15 +44,15 @@ struct mm_struct;
void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte);


-static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep)
+static inline void native_set_pte(pte_t *ptep, pte_t pte)
{
- *ptep = native_make_pte(0);
+ WRITE_ONCE(*ptep, pte);
}

-static inline void native_set_pte(pte_t *ptep, pte_t pte)
+static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep)
{
- *ptep = pte;
+ native_set_pte(ptep, native_make_pte(0));
}

static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)
@@ -62,7 +62,7 @@ static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)

static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd)
{
- *pmdp = pmd;
+ WRITE_ONCE(*pmdp, pmd);
}

static inline void native_pmd_clear(pmd_t *pmd)
@@ -98,7 +98,7 @@ static inline pmd_t native_pmdp_get_and_clear(pmd_t *xp)

static inline void native_set_pud(pud_t *pudp, pud_t pud)
{
- *pudp = pud;
+ WRITE_ONCE(*pudp, pud);
}

static inline void native_pud_clear(pud_t *pud)
@@ -131,7 +131,7 @@ static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)

static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
{
- *pgdp = kaiser_set_shadow_pgd(pgdp, pgd);
+ WRITE_ONCE(*pgdp, kaiser_set_shadow_pgd(pgdp, pgd));
}

static inline void native_pgd_clear(pgd_t *pgd)
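Switching the page-table stores to WRITE_ONCE() guarantees a single,
non-torn store, so a concurrent hardware or software page-table walker can
never observe a half-written entry. For a native-word entry it boils down
to a volatile store; roughly (a sketch of the effect, not the actual
WRITE_ONCE() macro expansion):

	static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd)
	{
		/* one volatile 64-bit store; the compiler may not split it */
		*(volatile pmd_t *)pmdp = pmd;
	}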
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 440a948c4feb..dab73faef9b0 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -845,4 +845,11 @@ bool xen_set_default_idle(void);

void stop_this_cpu(void *dummy);
void df_debug(struct pt_regs *regs, long error_code);
+
+enum mds_mitigations {
+ MDS_MITIGATION_OFF,
+ MDS_MITIGATION_FULL,
+ MDS_MITIGATION_VMWERV,
+};
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index ae7c2c5cd7f0..5393babc0598 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -53,12 +53,24 @@ static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn)
return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
}

+static inline u64 stibp_tif_to_spec_ctrl(u64 tifn)
+{
+ BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT);
+ return (tifn & _TIF_SPEC_IB) >> (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
+}
+
static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl)
{
BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT);
return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
}

+static inline unsigned long stibp_spec_ctrl_to_tif(u64 spec_ctrl)
+{
+ BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT);
+ return (spec_ctrl & SPEC_CTRL_STIBP) << (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
+}
+
static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
{
return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
@@ -70,11 +82,7 @@ extern void speculative_store_bypass_ht_init(void);
static inline void speculative_store_bypass_ht_init(void) { }
#endif

-extern void speculative_store_bypass_update(unsigned long tif);
-
-static inline void speculative_store_bypass_update_current(void)
-{
- speculative_store_bypass_update(current_thread_info()->flags);
-}
+extern void speculation_ctrl_update(unsigned long tif);
+extern void speculation_ctrl_update_current(void);

#endif
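The TIF-to-MSR conversions are pure shifts, which is why the BUILD_BUG_ON()
guards the relative ordering of the two bit positions. A worked example
with the values from this patch (TIF_SPEC_IB == 9, SPEC_CTRL_STIBP_SHIFT
== 1), for some task "tsk":

	/*
	 * (tifn & BIT(9)) >> (9 - 1):
	 *   flag set   -> BIT(1) == SPEC_CTRL_STIBP
	 *   flag clear -> 0
	 */
	u64 stibp = stibp_tif_to_spec_ctrl(task_thread_info(tsk)->flags);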
diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index 025ecfaba9c9..4ff0878f4633 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -6,9 +6,6 @@
struct task_struct; /* one of the stranger aspects of C forward declarations */
__visible struct task_struct *__switch_to(struct task_struct *prev,
struct task_struct *next);
-struct tss_struct;
-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
- struct tss_struct *tss);

#ifdef CONFIG_X86_32

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index a96e88b243ef..e522b15fa3f0 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -92,10 +92,12 @@ struct thread_info {
#define TIF_SIGPENDING 2 /* signal pending */
#define TIF_NEED_RESCHED 3 /* rescheduling necessary */
#define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/
-#define TIF_SSBD 5 /* Reduced data speculation */
+#define TIF_SSBD 5 /* Speculative store bypass disable */
#define TIF_SYSCALL_EMU 6 /* syscall emulation active */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SECCOMP 8 /* secure computing */
+#define TIF_SPEC_IB 9 /* Indirect branch speculation mitigation */
+#define TIF_SPEC_FORCE_UPDATE 10 /* Force speculation MSR update in context switch */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
@@ -121,6 +123,8 @@ struct thread_info {
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
+#define _TIF_SPEC_IB (1 << TIF_SPEC_IB)
+#define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_NOTSC (1 << TIF_NOTSC)
@@ -148,8 +152,18 @@ struct thread_info {
_TIF_NOHZ)

/* flags to check in __switch_to() */
-#define _TIF_WORK_CTXSW \
- (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD)
+#define _TIF_WORK_CTXSW_BASE \
+ (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP| \
+ _TIF_SSBD | _TIF_SPEC_FORCE_UPDATE)
+
+/*
+ * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated.
+ */
+#ifdef CONFIG_SMP
+# define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE | _TIF_SPEC_IB)
+#else
+# define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE)
+#endif

#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
#define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 72cfe3e53af1..8dab88b85785 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -68,8 +68,12 @@ static inline void invpcid_flush_all_nonglobals(void)
struct tlb_state {
struct mm_struct *active_mm;
int state;
- /* last user mm's ctx id */
- u64 last_ctx_id;
+
+ /* Last user mm for optimizing IBPB */
+ union {
+ struct mm_struct *last_user_mm;
+ unsigned long last_user_mm_ibpb;
+ };

/*
* Access to this CR4 shadow and to H/W CR4 is protected by
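The union lets the IBPB logic reuse the per-CPU mm-pointer slot as a tagged
word. In the matching tlb.c change (not shown in this excerpt) the low
pointer bit carries an "IBPB needed" flag; a rough sketch of the encoding,
with "next" being the incoming task and the flag name as used upstream:

	#define LAST_USER_MM_IBPB	0x1UL	/* low mm-pointer bit */

	/* Tag the saved mm when the next task asked for IBPB protection: */
	unsigned long next_mm = (unsigned long)next->mm |
		(task_thread_info(next)->flags & _TIF_SPEC_IB ?
			LAST_USER_MM_IBPB : 0);

	this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm);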
diff --git a/arch/x86/include/uapi/asm/Kbuild b/arch/x86/include/uapi/asm/Kbuild
index 3dec769cadf7..1c532b3f18ea 100644
--- a/arch/x86/include/uapi/asm/Kbuild
+++ b/arch/x86/include/uapi/asm/Kbuild
@@ -27,7 +27,6 @@ header-y += ldt.h
header-y += mce.h
header-y += mman.h
header-y += msgbuf.h
-header-y += msr-index.h
header-y += msr.h
header-y += mtrr.h
header-y += param.h
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 03429da2fa80..83b9be4e0492 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -26,6 +26,10 @@ struct mce {
__u32 socketid; /* CPU socket ID */
__u32 apicid; /* CPU initial apic ID */
__u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
+ __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
+ __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
+ __u32 microcode;/* Microcode revision */
};

#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 621bc6561189..2017fa20611c 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -13,6 +13,7 @@
#include <linux/module.h>
#include <linux/nospec.h>
#include <linux/prctl.h>
+#include <linux/sched/smt.h>

#include <asm/spec-ctrl.h>
#include <asm/cmdline.h>
@@ -23,6 +24,7 @@
#include <asm/msr.h>
#include <asm/paravirt.h>
#include <asm/alternative.h>
+#include <asm/hypervisor.h>
#include <asm/pgtable.h>
#include <asm/cacheflush.h>
#include <asm/intel-family.h>
@@ -31,13 +33,12 @@
static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
+static void __init mds_select_mitigation(void);

-/*
- * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any
- * writes to SPEC_CTRL contain whatever reserved bits have been set.
- */
+/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
u64 x86_spec_ctrl_base;
EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
+static DEFINE_MUTEX(spec_ctrl_mutex);

/*
* The vendor and possibly platform specific bits which can be modified in
@@ -52,6 +53,19 @@ static u64 x86_spec_ctrl_mask = SPEC_CTRL_IBRS;
u64 x86_amd_ls_cfg_base;
u64 x86_amd_ls_cfg_ssbd_mask;

+/* Control conditional STIBP in switch_to() */
+DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+/* Control conditional IBPB in switch_mm() */
+DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
+/* Control unconditional IBPB in switch_mm() */
+DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+
+/* Control MDS CPU buffer clear before returning to user space */
+DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+/* Control MDS CPU buffer clear before idling (halt, mwait) */
+DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
+EXPORT_SYMBOL_GPL(mds_idle_clear);
+
void __init check_bugs(void)
{
identify_boot_cpu();
@@ -84,6 +98,10 @@ void __init check_bugs(void)

l1tf_select_mitigation();

+ mds_select_mitigation();
+
+ arch_smt_update();
+
#ifdef CONFIG_X86_32
/*
* Check whether we are able to run this kernel safely on SMP.
@@ -116,29 +134,6 @@ void __init check_bugs(void)
#endif
}

-/* The kernel command line selection */
-enum spectre_v2_mitigation_cmd {
- SPECTRE_V2_CMD_NONE,
- SPECTRE_V2_CMD_AUTO,
- SPECTRE_V2_CMD_FORCE,
- SPECTRE_V2_CMD_RETPOLINE,
- SPECTRE_V2_CMD_RETPOLINE_GENERIC,
- SPECTRE_V2_CMD_RETPOLINE_AMD,
-};
-
-static const char *spectre_v2_strings[] = {
- [SPECTRE_V2_NONE] = "Vulnerable",
- [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline",
- [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline",
- [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
- [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
-};
-
-#undef pr_fmt
-#define pr_fmt(fmt) "Spectre V2 : " fmt
-
-static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
-
void
x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
{
@@ -156,9 +151,14 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;

/* SSBD controlled in MSR_SPEC_CTRL */
- if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
+ if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+ static_cpu_has(X86_FEATURE_AMD_SSBD))
hostval |= ssbd_tif_to_spec_ctrl(ti->flags);

+ /* Conditional STIBP enabled? */
+ if (static_branch_unlikely(&switch_to_cond_stibp))
+ hostval |= stibp_tif_to_spec_ctrl(ti->flags);
+
if (hostval != guestval) {
msrval = setguest ? guestval : hostval;
wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
@@ -192,7 +192,7 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
tif = setguest ? ssbd_spec_ctrl_to_tif(guestval) :
ssbd_spec_ctrl_to_tif(hostval);

- speculative_store_bypass_update(tif);
+ speculation_ctrl_update(tif);
}
}
EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl);
@@ -207,6 +207,57 @@ static void x86_amd_ssb_disable(void)
wrmsrl(MSR_AMD64_LS_CFG, msrval);
}

+#undef pr_fmt
+#define pr_fmt(fmt) "MDS: " fmt
+
+/* Default mitigation for MDS-affected CPUs */
+static enum mds_mitigations mds_mitigation = MDS_MITIGATION_FULL;
+
+static const char * const mds_strings[] = {
+ [MDS_MITIGATION_OFF] = "Vulnerable",
+ [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers",
+ [MDS_MITIGATION_VMWERV] = "Vulnerable: Clear CPU buffers attempted, no microcode",
+};
+
+static void __init mds_select_mitigation(void)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) {
+ mds_mitigation = MDS_MITIGATION_OFF;
+ return;
+ }
+
+ if (mds_mitigation == MDS_MITIGATION_FULL) {
+ if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ mds_mitigation = MDS_MITIGATION_VMWERV;
+ static_branch_enable(&mds_user_clear);
+ }
+ pr_info("%s\n", mds_strings[mds_mitigation]);
+}
+
+static int __init mds_cmdline(char *str)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MDS))
+ return 0;
+
+ if (!str)
+ return -EINVAL;
+
+ if (!strcmp(str, "off"))
+ mds_mitigation = MDS_MITIGATION_OFF;
+ else if (!strcmp(str, "full"))
+ mds_mitigation = MDS_MITIGATION_FULL;
+
+ return 0;
+}
+early_param("mds", mds_cmdline);
+
+#undef pr_fmt
+#define pr_fmt(fmt) "Spectre V2 : " fmt
+
+static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
+
+static enum spectre_v2_user_mitigation spectre_v2_user = SPECTRE_V2_USER_NONE;
+
#ifdef RETPOLINE
static bool spectre_v2_bad_module;
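Usage: the new "mds=" boot parameter is matched verbatim by mds_cmdline()
above, so affected systems can opt out (or be explicit) at boot time, e.g.:

	mds=off		leave the buffers unflushed, report "Vulnerable"
	mds=full	the default: clear CPU buffers on return to user space

Unrecognized values fall through and leave the default in place.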

@@ -228,67 +279,224 @@ static inline const char *spectre_v2_module_string(void)
static inline const char *spectre_v2_module_string(void) { return ""; }
#endif

-static void __init spec2_print_if_insecure(const char *reason)
+static inline bool match_option(const char *arg, int arglen, const char *opt)
{
- if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
- pr_info("%s selected on command line.\n", reason);
+ int len = strlen(opt);
+
+ return len == arglen && !strncmp(arg, opt, len);
}

-static void __init spec2_print_if_secure(const char *reason)
+/* The kernel command line selection for spectre v2 */
+enum spectre_v2_mitigation_cmd {
+ SPECTRE_V2_CMD_NONE,
+ SPECTRE_V2_CMD_AUTO,
+ SPECTRE_V2_CMD_FORCE,
+ SPECTRE_V2_CMD_RETPOLINE,
+ SPECTRE_V2_CMD_RETPOLINE_GENERIC,
+ SPECTRE_V2_CMD_RETPOLINE_AMD,
+};
+
+enum spectre_v2_user_cmd {
+ SPECTRE_V2_USER_CMD_NONE,
+ SPECTRE_V2_USER_CMD_AUTO,
+ SPECTRE_V2_USER_CMD_FORCE,
+ SPECTRE_V2_USER_CMD_PRCTL,
+ SPECTRE_V2_USER_CMD_PRCTL_IBPB,
+ SPECTRE_V2_USER_CMD_SECCOMP,
+ SPECTRE_V2_USER_CMD_SECCOMP_IBPB,
+};
+
+static const char * const spectre_v2_user_strings[] = {
+ [SPECTRE_V2_USER_NONE] = "User space: Vulnerable",
+ [SPECTRE_V2_USER_STRICT] = "User space: Mitigation: STIBP protection",
+ [SPECTRE_V2_USER_PRCTL] = "User space: Mitigation: STIBP via prctl",
+ [SPECTRE_V2_USER_SECCOMP] = "User space: Mitigation: STIBP via seccomp and prctl",
+};
+
+static const struct {
+ const char *option;
+ enum spectre_v2_user_cmd cmd;
+ bool secure;
+} v2_user_options[] __initconst = {
+ { "auto", SPECTRE_V2_USER_CMD_AUTO, false },
+ { "off", SPECTRE_V2_USER_CMD_NONE, false },
+ { "on", SPECTRE_V2_USER_CMD_FORCE, true },
+ { "prctl", SPECTRE_V2_USER_CMD_PRCTL, false },
+ { "prctl,ibpb", SPECTRE_V2_USER_CMD_PRCTL_IBPB, false },
+ { "seccomp", SPECTRE_V2_USER_CMD_SECCOMP, false },
+ { "seccomp,ibpb", SPECTRE_V2_USER_CMD_SECCOMP_IBPB, false },
+};
+
+static void __init spec_v2_user_print_cond(const char *reason, bool secure)
{
- if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
- pr_info("%s selected on command line.\n", reason);
+ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure)
+ pr_info("spectre_v2_user=%s forced on command line.\n", reason);
}

-static inline bool retp_compiler(void)
+static enum spectre_v2_user_cmd __init
+spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
{
- return __is_defined(RETPOLINE);
+ char arg[20];
+ int ret, i;
+
+ switch (v2_cmd) {
+ case SPECTRE_V2_CMD_NONE:
+ return SPECTRE_V2_USER_CMD_NONE;
+ case SPECTRE_V2_CMD_FORCE:
+ return SPECTRE_V2_USER_CMD_FORCE;
+ default:
+ break;
+ }
+
+ ret = cmdline_find_option(boot_command_line, "spectre_v2_user",
+ arg, sizeof(arg));
+ if (ret < 0)
+ return SPECTRE_V2_USER_CMD_AUTO;
+
+ for (i = 0; i < ARRAY_SIZE(v2_user_options); i++) {
+ if (match_option(arg, ret, v2_user_options[i].option)) {
+ spec_v2_user_print_cond(v2_user_options[i].option,
+ v2_user_options[i].secure);
+ return v2_user_options[i].cmd;
+ }
+ }
+
+ pr_err("Unknown user space protection option (%s). Switching to AUTO select\n", arg);
+ return SPECTRE_V2_USER_CMD_AUTO;
}

-static inline bool match_option(const char *arg, int arglen, const char *opt)
+static void __init
+spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
{
- int len = strlen(opt);
+ enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE;
+ bool smt_possible = IS_ENABLED(CONFIG_SMP);
+ enum spectre_v2_user_cmd cmd;

- return len == arglen && !strncmp(arg, opt, len);
+ if (!boot_cpu_has(X86_FEATURE_IBPB) && !boot_cpu_has(X86_FEATURE_STIBP))
+ return;
+
+ if (!IS_ENABLED(CONFIG_SMP))
+ smt_possible = false;
+
+ cmd = spectre_v2_parse_user_cmdline(v2_cmd);
+ switch (cmd) {
+ case SPECTRE_V2_USER_CMD_NONE:
+ goto set_mode;
+ case SPECTRE_V2_USER_CMD_FORCE:
+ mode = SPECTRE_V2_USER_STRICT;
+ break;
+ case SPECTRE_V2_USER_CMD_PRCTL:
+ case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
+ mode = SPECTRE_V2_USER_PRCTL;
+ break;
+ case SPECTRE_V2_USER_CMD_AUTO:
+ case SPECTRE_V2_USER_CMD_SECCOMP:
+ case SPECTRE_V2_USER_CMD_SECCOMP_IBPB:
+ if (IS_ENABLED(CONFIG_SECCOMP))
+ mode = SPECTRE_V2_USER_SECCOMP;
+ else
+ mode = SPECTRE_V2_USER_PRCTL;
+ break;
+ }
+
+ /* Initialize Indirect Branch Prediction Barrier */
+ if (boot_cpu_has(X86_FEATURE_IBPB)) {
+ setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
+
+ switch (cmd) {
+ case SPECTRE_V2_USER_CMD_FORCE:
+ case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
+ case SPECTRE_V2_USER_CMD_SECCOMP_IBPB:
+ static_branch_enable(&switch_mm_always_ibpb);
+ break;
+ case SPECTRE_V2_USER_CMD_PRCTL:
+ case SPECTRE_V2_USER_CMD_AUTO:
+ case SPECTRE_V2_USER_CMD_SECCOMP:
+ static_branch_enable(&switch_mm_cond_ibpb);
+ break;
+ default:
+ break;
+ }
+
+ pr_info("mitigation: Enabling %s Indirect Branch Prediction Barrier\n",
+ static_key_enabled(&switch_mm_always_ibpb) ?
+ "always-on" : "conditional");
+ }
+
+ /* If enhanced IBRS is enabled no STIBP required */
+ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+ return;
+
+ /*
+ * If SMT is not possible or STIBP is not available, clear the STIBP
+ * mode.
+ */
+ if (!smt_possible || !boot_cpu_has(X86_FEATURE_STIBP))
+ mode = SPECTRE_V2_USER_NONE;
+set_mode:
+ spectre_v2_user = mode;
+ /* Only print the STIBP mode when SMT possible */
+ if (smt_possible)
+ pr_info("%s\n", spectre_v2_user_strings[mode]);
}

+static const char * const spectre_v2_strings[] = {
+ [SPECTRE_V2_NONE] = "Vulnerable",
+ [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline",
+ [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline",
+ [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
+ [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
+ [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS",
+};
+
static const struct {
const char *option;
enum spectre_v2_mitigation_cmd cmd;
bool secure;
-} mitigation_options[] = {
- { "off", SPECTRE_V2_CMD_NONE, false },
- { "on", SPECTRE_V2_CMD_FORCE, true },
- { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
- { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false },
- { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
- { "auto", SPECTRE_V2_CMD_AUTO, false },
+} mitigation_options[] __initconst = {
+ { "off", SPECTRE_V2_CMD_NONE, false },
+ { "on", SPECTRE_V2_CMD_FORCE, true },
+ { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
+ { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false },
+ { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
+ { "auto", SPECTRE_V2_CMD_AUTO, false },
};

+static void __init spec_v2_print_cond(const char *reason, bool secure)
+{
+ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure)
+ pr_info("%s selected on command line.\n", reason);
+}
+
+static inline bool retp_compiler(void)
+{
+ return __is_defined(RETPOLINE);
+}
+
static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
{
+ enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
char arg[20];
int ret, i;
- enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;

- if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
+ if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") ||
+ cpu_mitigations_off())
return SPECTRE_V2_CMD_NONE;
- else {
- ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
- if (ret < 0)
- return SPECTRE_V2_CMD_AUTO;

- for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
- if (!match_option(arg, ret, mitigation_options[i].option))
- continue;
- cmd = mitigation_options[i].cmd;
- break;
- }
+ ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
+ if (ret < 0)
+ return SPECTRE_V2_CMD_AUTO;

- if (i >= ARRAY_SIZE(mitigation_options)) {
- pr_err("unknown option (%s). Switching to AUTO select\n", arg);
- return SPECTRE_V2_CMD_AUTO;
- }
+ for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
+ if (!match_option(arg, ret, mitigation_options[i].option))
+ continue;
+ cmd = mitigation_options[i].cmd;
+ break;
+ }
+
+ if (i >= ARRAY_SIZE(mitigation_options)) {
+ pr_err("unknown option (%s). Switching to AUTO select\n", arg);
+ return SPECTRE_V2_CMD_AUTO;
}

if ((cmd == SPECTRE_V2_CMD_RETPOLINE ||
@@ -305,11 +513,8 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
return SPECTRE_V2_CMD_AUTO;
}

- if (mitigation_options[i].secure)
- spec2_print_if_secure(mitigation_options[i].option);
- else
- spec2_print_if_insecure(mitigation_options[i].option);
-
+ spec_v2_print_cond(mitigation_options[i].option,
+ mitigation_options[i].secure);
return cmd;
}
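Usage: the spectre_v2_user= option selects the user-space STIBP/IBPB policy
independently of the kernel-side spectre_v2= choice, e.g.:

	spectre_v2_user=seccomp,ibpb

picks STIBP via seccomp/prctl plus an always-on IBPB at context switch, per
the v2_user_options[] table and the switch statements above.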

@@ -332,6 +537,13 @@ static void __init spectre_v2_select_mitigation(void)

case SPECTRE_V2_CMD_FORCE:
case SPECTRE_V2_CMD_AUTO:
+ if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
+ mode = SPECTRE_V2_IBRS_ENHANCED;
+ /* Force it so VMEXIT will restore correctly */
+ x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
+ wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+ goto specv2_set_mode;
+ }
if (IS_ENABLED(CONFIG_RETPOLINE))
goto retpoline_auto;
break;
@@ -369,6 +581,7 @@ retpoline_auto:
setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
}

+specv2_set_mode:
spectre_v2_enabled = mode;
pr_info("%s\n", spectre_v2_strings[mode]);

@@ -383,20 +596,114 @@ retpoline_auto:
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");

- /* Initialize Indirect Branch Prediction Barrier if supported */
- if (boot_cpu_has(X86_FEATURE_IBPB)) {
- setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
- pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
- }
-
/*
* Retpoline means the kernel is safe because it has no indirect
- * branches. But firmware isn't, so use IBRS to protect that.
+ * branches. Enhanced IBRS protects firmware too, so enable restricted
+ * speculation around firmware calls only when Enhanced IBRS isn't
+ * supported.
+ *
+ * Use "mode" to check Enhanced IBRS instead of boot_cpu_has(), because
+ * the user might select retpoline on the kernel command line and if
+ * the CPU supports Enhanced IBRS, the kernel might unintentionally not
+ * enable IBRS around firmware calls.
*/
- if (boot_cpu_has(X86_FEATURE_IBRS)) {
+ if (boot_cpu_has(X86_FEATURE_IBRS) && mode != SPECTRE_V2_IBRS_ENHANCED) {
setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
pr_info("Enabling Restricted Speculation for firmware calls\n");
}
+
+ /* Set up IBPB and STIBP depending on the general spectre V2 command */
+ spectre_v2_user_select_mitigation(cmd);
+}
+
+static void update_stibp_msr(void * __unused)
+{
+ wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+
+/* Update x86_spec_ctrl_base in case SMT state changed. */
+static void update_stibp_strict(void)
+{
+ u64 mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
+
+ if (sched_smt_active())
+ mask |= SPEC_CTRL_STIBP;
+
+ if (mask == x86_spec_ctrl_base)
+ return;
+
+ pr_info("Update user space SMT mitigation: STIBP %s\n",
+ mask & SPEC_CTRL_STIBP ? "always-on" : "off");
+ x86_spec_ctrl_base = mask;
+ on_each_cpu(update_stibp_msr, NULL, 1);
+}
+
+/* Update the static key controlling the evaluation of TIF_SPEC_IB */
+static void update_indir_branch_cond(void)
+{
+ if (sched_smt_active())
+ static_branch_enable(&switch_to_cond_stibp);
+ else
+ static_branch_disable(&switch_to_cond_stibp);
+}
+
+#undef pr_fmt
+#define pr_fmt(fmt) fmt
+
+/* Update the static key controlling the MDS CPU buffer clear in idle */
+static void update_mds_branch_idle(void)
+{
+ /*
+ * Enable the idle clearing if SMT is active on CPUs which are
+ * affected only by MSBDS and not any other MDS variant.
+ *
+ * The other variants cannot be mitigated when SMT is enabled, so
+ * clearing the buffers on idle just to prevent the Store Buffer
+ * repartitioning leak would be a window dressing exercise.
+ */
+ if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
+ return;
+
+ if (sched_smt_active())
+ static_branch_enable(&mds_idle_clear);
+ else
+ static_branch_disable(&mds_idle_clear);
+}
+
+#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+
+void arch_smt_update(void)
+{
+ /* Enhanced IBRS implies STIBP. No update required. */
+ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+ return;
+
+ mutex_lock(&spec_ctrl_mutex);
+
+ switch (spectre_v2_user) {
+ case SPECTRE_V2_USER_NONE:
+ break;
+ case SPECTRE_V2_USER_STRICT:
+ update_stibp_strict();
+ break;
+ case SPECTRE_V2_USER_PRCTL:
+ case SPECTRE_V2_USER_SECCOMP:
+ update_indir_branch_cond();
+ break;
+ }
+
+ switch (mds_mitigation) {
+ case MDS_MITIGATION_FULL:
+ case MDS_MITIGATION_VMWERV:
+ if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ pr_warn_once(MDS_MSG_SMT);
+ update_mds_branch_idle();
+ break;
+ case MDS_MITIGATION_OFF:
+ break;
+ }
+
+ mutex_unlock(&spec_ctrl_mutex);
}

#undef pr_fmt
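For orientation, the control flow when the SMT state flips is (a summary of
the functions above, not new code):

	/*
	 * CPU hotplug changes sched_smt_active()
	 *   -> arch_smt_update()
	 *        -> update_stibp_strict()       strict mode: rewrite the MSR
	 *                                       on all CPUs via on_each_cpu()
	 *        -> update_indir_branch_cond()  prctl/seccomp: flip the
	 *                                       switch_to_cond_stibp key
	 *        -> update_mds_branch_idle()    MSBDS-only parts: toggle the
	 *                                       idle buffer clearing
	 */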
@@ -413,7 +720,7 @@ enum ssb_mitigation_cmd {
SPEC_STORE_BYPASS_CMD_SECCOMP,
};

-static const char *ssb_strings[] = {
+static const char * const ssb_strings[] = {
[SPEC_STORE_BYPASS_NONE] = "Vulnerable",
[SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled",
[SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl",
@@ -423,7 +730,7 @@ static const char *ssb_strings[] = {
static const struct {
const char *option;
enum ssb_mitigation_cmd cmd;
-} ssb_mitigation_options[] = {
+} ssb_mitigation_options[] __initconst = {
{ "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */
{ "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */
{ "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
@@ -437,7 +744,8 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
char arg[20];
int ret, i;

- if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable")) {
+ if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable") ||
+ cpu_mitigations_off()) {
return SPEC_STORE_BYPASS_CMD_NONE;
} else {
ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
@@ -507,18 +815,16 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void)
if (mode == SPEC_STORE_BYPASS_DISABLE) {
setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
/*
- * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses
- * a completely different MSR and bit dependent on family.
+ * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may
+ * use a completely different MSR and bit dependent on family.
*/
- switch (boot_cpu_data.x86_vendor) {
- case X86_VENDOR_INTEL:
+ if (!static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) &&
+ !static_cpu_has(X86_FEATURE_AMD_SSBD)) {
+ x86_amd_ssb_disable();
+ } else {
x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
- break;
- case X86_VENDOR_AMD:
- x86_amd_ssb_disable();
- break;
}
}

@@ -536,10 +842,25 @@ static void ssb_select_mitigation(void)
#undef pr_fmt
#define pr_fmt(fmt) "Speculation prctl: " fmt

-static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
+static void task_update_spec_tif(struct task_struct *tsk)
{
- bool update;
+ /* Force the update of the real TIF bits */
+ set_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE);
+
+ /*
+ * Immediately update the speculation control MSRs for the current
+ * task, but for a non-current task delay setting the CPU
+ * mitigation until it is scheduled next.
+ *
+ * This can only happen for SECCOMP mitigation. For PRCTL it's
+ * always the current task.
+ */
+ if (tsk == current)
+ speculation_ctrl_update_current();
+}

+static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
if (ssb_mode != SPEC_STORE_BYPASS_PRCTL &&
ssb_mode != SPEC_STORE_BYPASS_SECCOMP)
return -ENXIO;
@@ -550,28 +871,56 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
if (task_spec_ssb_force_disable(task))
return -EPERM;
task_clear_spec_ssb_disable(task);
- update = test_and_clear_tsk_thread_flag(task, TIF_SSBD);
+ task_update_spec_tif(task);
break;
case PR_SPEC_DISABLE:
task_set_spec_ssb_disable(task);
- update = !test_and_set_tsk_thread_flag(task, TIF_SSBD);
+ task_update_spec_tif(task);
break;
case PR_SPEC_FORCE_DISABLE:
task_set_spec_ssb_disable(task);
task_set_spec_ssb_force_disable(task);
- update = !test_and_set_tsk_thread_flag(task, TIF_SSBD);
+ task_update_spec_tif(task);
break;
default:
return -ERANGE;
}
+ return 0;
+}

- /*
- * If being set on non-current task, delay setting the CPU
- * mitigation until it is next scheduled.
- */
- if (task == current && update)
- speculative_store_bypass_update_current();
-
+static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
+ switch (ctrl) {
+ case PR_SPEC_ENABLE:
+ if (spectre_v2_user == SPECTRE_V2_USER_NONE)
+ return 0;
+ /*
+ * Indirect branch speculation is always disabled in strict
+ * mode.
+ */
+ if (spectre_v2_user == SPECTRE_V2_USER_STRICT)
+ return -EPERM;
+ task_clear_spec_ib_disable(task);
+ task_update_spec_tif(task);
+ break;
+ case PR_SPEC_DISABLE:
+ case PR_SPEC_FORCE_DISABLE:
+ /*
+ * Indirect branch speculation is always allowed when
+ * mitigation is force disabled.
+ */
+ if (spectre_v2_user == SPECTRE_V2_USER_NONE)
+ return -EPERM;
+ if (spectre_v2_user == SPECTRE_V2_USER_STRICT)
+ return 0;
+ task_set_spec_ib_disable(task);
+ if (ctrl == PR_SPEC_FORCE_DISABLE)
+ task_set_spec_ib_force_disable(task);
+ task_update_spec_tif(task);
+ break;
+ default:
+ return -ERANGE;
+ }
return 0;
}

@@ -581,6 +930,8 @@ int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which,
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssb_prctl_set(task, ctrl);
+ case PR_SPEC_INDIRECT_BRANCH:
+ return ib_prctl_set(task, ctrl);
default:
return -ENODEV;
}
@@ -591,6 +942,8 @@ void arch_seccomp_spec_mitigate(struct task_struct *task)
{
if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP)
ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE);
+ if (spectre_v2_user == SPECTRE_V2_USER_SECCOMP)
+ ib_prctl_set(task, PR_SPEC_FORCE_DISABLE);
}
#endif
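From user space the new control is reached through the existing speculation
prctl pair. A minimal sketch (error handling omitted; the PR_SPEC_*
constants come from the updated uapi <linux/prctl.h>):

	#include <sys/prctl.h>
	#include <linux/prctl.h>

	/* Opt this task out of indirect branch speculation: */
	prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
	      PR_SPEC_DISABLE, 0, 0);

	/* Read back the state (here: PR_SPEC_PRCTL | PR_SPEC_DISABLE): */
	int state = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
			  0, 0, 0);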

@@ -613,11 +966,35 @@ static int ssb_prctl_get(struct task_struct *task)
}
}

+static int ib_prctl_get(struct task_struct *task)
+{
+ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+ return PR_SPEC_NOT_AFFECTED;
+
+ switch (spectre_v2_user) {
+ case SPECTRE_V2_USER_NONE:
+ return PR_SPEC_ENABLE;
+ case SPECTRE_V2_USER_PRCTL:
+ case SPECTRE_V2_USER_SECCOMP:
+ if (task_spec_ib_force_disable(task))
+ return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
+ if (task_spec_ib_disable(task))
+ return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
+ return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
+ case SPECTRE_V2_USER_STRICT:
+ return PR_SPEC_DISABLE;
+ default:
+ return PR_SPEC_NOT_AFFECTED;
+ }
+}
+
int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssb_prctl_get(task);
+ case PR_SPEC_INDIRECT_BRANCH:
+ return ib_prctl_get(task);
default:
return -ENODEV;
}
@@ -694,16 +1071,66 @@ static void __init l1tf_select_mitigation(void)
pr_info("You may make it effective by booting the kernel with mem=%llu parameter.\n",
half_pa);
pr_info("However, doing so will make a part of your RAM unusable.\n");
- pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html might help you decide.\n");
+ pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html might help you decide.\n");
return;
}

setup_force_cpu_cap(X86_FEATURE_L1TF_PTEINV);
}
#undef pr_fmt
+#define pr_fmt(fmt) fmt

#ifdef CONFIG_SYSFS

+static ssize_t mds_show_state(char *buf)
+{
+#ifdef CONFIG_HYPERVISOR_GUEST
+ if (x86_hyper) {
+ return sprintf(buf, "%s; SMT Host state unknown\n",
+ mds_strings[mds_mitigation]);
+ }
+#endif
+
+ if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) {
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ (mds_mitigation == MDS_MITIGATION_OFF ? "vulnerable" :
+ sched_smt_active() ? "mitigated" : "disabled"));
+ }
+
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ sched_smt_active() ? "vulnerable" : "disabled");
+}
+
+static char *stibp_state(void)
+{
+ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+ return "";
+
+ switch (spectre_v2_user) {
+ case SPECTRE_V2_USER_NONE:
+ return ", STIBP: disabled";
+ case SPECTRE_V2_USER_STRICT:
+ return ", STIBP: forced";
+ case SPECTRE_V2_USER_PRCTL:
+ case SPECTRE_V2_USER_SECCOMP:
+ if (static_key_enabled(&switch_to_cond_stibp))
+ return ", STIBP: conditional";
+ }
+ return "";
+}
+
+static char *ibpb_state(void)
+{
+ if (boot_cpu_has(X86_FEATURE_IBPB)) {
+ if (static_key_enabled(&switch_mm_always_ibpb))
+ return ", IBPB: always-on";
+ if (static_key_enabled(&switch_mm_cond_ibpb))
+ return ", IBPB: conditional";
+ return ", IBPB: disabled";
+ }
+ return "";
+}
+
static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
char *buf, unsigned int bug)
{
@@ -721,9 +1148,11 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
return sprintf(buf, "Mitigation: __user pointer sanitization\n");

case X86_BUG_SPECTRE_V2:
- return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
- boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+ return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+ ibpb_state(),
boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+ stibp_state(),
+ boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
spectre_v2_module_string());

case X86_BUG_SPEC_STORE_BYPASS:
@@ -731,9 +1160,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr

case X86_BUG_L1TF:
if (boot_cpu_has(X86_FEATURE_L1TF_PTEINV))
- return sprintf(buf, "Mitigation: Page Table Inversion\n");
+ return sprintf(buf, "Mitigation: PTE Inversion\n");
break;

+ case X86_BUG_MDS:
+ return mds_show_state(buf);
+
default:
break;
}
@@ -765,4 +1197,9 @@ ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *b
{
return cpu_show_common(dev, attr, buf, X86_BUG_L1TF);
}
+
+ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
+}
#endif
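With cpu_show_mds() wired up, the MDS state is queryable like the other
vulnerability files; e.g. reading
/sys/devices/system/cpu/vulnerabilities/mds prints a line such as

	Mitigation: Clear CPU buffers; SMT vulnerable

where the first part comes from mds_strings[] and the SMT suffix from
mds_show_state() above (the exact output depends on the CPU and SMT state).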
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index e8b46f575306..4bce77bc7e61 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -709,6 +709,12 @@ static void init_speculation_control(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_STIBP);
set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
}
+
+ if (cpu_has(c, X86_FEATURE_AMD_SSBD)) {
+ set_cpu_cap(c, X86_FEATURE_SSBD);
+ set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
+ clear_cpu_cap(c, X86_FEATURE_VIRT_SSBD);
+ }
}

void get_cpu_cap(struct cpuinfo_x86 *c)
@@ -841,81 +847,95 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#endif
}

-static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_CENTAUR, 5 },
- { X86_VENDOR_INTEL, 5 },
- { X86_VENDOR_NSC, 5 },
- { X86_VENDOR_ANY, 4 },
+#define NO_SPECULATION BIT(0)
+#define NO_MELTDOWN BIT(1)
+#define NO_SSB BIT(2)
+#define NO_L1TF BIT(3)
+#define NO_MDS BIT(4)
+#define MSBDS_ONLY BIT(5)
+
+#define VULNWL(_vendor, _family, _model, _whitelist) \
+ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
+
+#define VULNWL_INTEL(model, whitelist) \
+ VULNWL(INTEL, 6, INTEL_FAM6_##model, whitelist)
+
+#define VULNWL_AMD(family, whitelist) \
+ VULNWL(AMD, family, X86_MODEL_ANY, whitelist)
+
+static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ VULNWL(ANY, 4, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(CENTAUR, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
+
+ /* Intel Family 6 */
+ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
+
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
+
+ VULNWL_INTEL(CORE_YONAH, NO_SSB),
+
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY),
+
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF),
+
+ /* AMD Family 0xf - 0x12 */
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+
+ /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
{}
};

-static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
- { X86_VENDOR_AMD },
- {}
-};
-
-static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- { X86_VENDOR_CENTAUR, 5, },
- { X86_VENDOR_INTEL, 5, },
- { X86_VENDOR_NSC, 5, },
- { X86_VENDOR_AMD, 0x12, },
- { X86_VENDOR_AMD, 0x11, },
- { X86_VENDOR_AMD, 0x10, },
- { X86_VENDOR_AMD, 0xf, },
- { X86_VENDOR_ANY, 4, },
- {}
-};
+static bool __init cpu_matches(unsigned long which)
+{
+ const struct x86_cpu_id *m = x86_match_cpu(cpu_vuln_whitelist);

-static const __initconst struct x86_cpu_id cpu_no_l1tf[] = {
- /* in addition to cpu_no_speculation */
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MOOREFIELD },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_DENVERTON },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GEMINI_LAKE },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- {}
-};
+ return m && !!(m->driver_data & which);
+}

static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 ia32_cap = 0;

+ if (cpu_matches(NO_SPECULATION))
+ return;
+
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+
if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

- if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
- !(ia32_cap & ARCH_CAP_SSB_NO))
+ if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
+ !cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);

- if (x86_match_cpu(cpu_no_speculation))
- return;
+ if (ia32_cap & ARCH_CAP_IBRS_ALL)
+ setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);

- setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
- setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) {
+ setup_force_cpu_bug(X86_BUG_MDS);
+ if (cpu_matches(MSBDS_ONLY))
+ setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
+ }

- if (x86_match_cpu(cpu_no_meltdown))
+ if (cpu_matches(NO_MELTDOWN))
return;

/* Rogue Data Cache Load? No! */
@@ -924,7 +944,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)

setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);

- if (x86_match_cpu(cpu_no_l1tf))
+ if (cpu_matches(NO_L1TF))
return;

setup_force_cpu_bug(X86_BUG_L1TF);
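For clarity, the VULNWL_*() macros are plain x86_cpu_id initializers; for
example

	VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF)

expands to

	{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT, X86_FEATURE_ANY,
	  NO_MDS | NO_L1TF }

with the whitelist flags landing in the driver_data member that
cpu_matches() masks against.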
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index b18fe3d245fe..b0e0c7a12e61 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -14,6 +14,7 @@
#include <asm/bugs.h>
#include <asm/cpu.h>
#include <asm/intel-family.h>
+#include <asm/microcode_intel.h>

#ifdef CONFIG_X86_64
#include <linux/topology.h>
@@ -102,14 +103,8 @@ static void early_init_intel(struct cpuinfo_x86 *c)
(c->x86 == 0x6 && c->x86_model >= 0x0e))
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);

- if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64)) {
- unsigned lower_word;
-
- wrmsr(MSR_IA32_UCODE_REV, 0, 0);
- /* Required by the SDM */
- sync_core();
- rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode);
- }
+ if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64))
+ c->microcode = intel_get_microcode_revision();

/* Now if any of them are set, check the blacklist and clear the lot */
if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) ||
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 9c682c222071..1ce85ba50005 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -132,6 +132,11 @@ static struct severity {
SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
USER
),
+ MCESEV(
+ PANIC, "Instruction fetch error in kernel",
+ SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
+ KERNEL
+ ),
#endif
MCESEV(
PANIC, "Action required: unknown MCACOD",
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 77f7580e22c6..4b9cfdcc3aaa 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -138,6 +138,8 @@ void mce_setup(struct mce *m)
m->socketid = cpu_data(m->extcpu).phys_proc_id;
m->apicid = cpu_data(m->extcpu).initial_apicid;
rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+
+ m->microcode = boot_cpu_data.microcode;
}

DEFINE_PER_CPU(struct mce, injectm);
@@ -258,7 +260,7 @@ static void print_mce(struct mce *m)
*/
pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
- cpu_data(m->extcpu).microcode);
+ m->microcode);

/*
* Print out human-readable details about the MCE error,
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 6da6f9cd6d2d..ca5b45799264 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -695,22 +695,26 @@ int apply_microcode_amd(int cpu)
return -1;

/* need to apply patch? */
- if (rev >= mc_amd->hdr.patch_id) {
- c->microcode = rev;
- uci->cpu_sig.rev = rev;
- return 0;
- }
+ if (rev >= mc_amd->hdr.patch_id)
+ goto out;

if (__apply_microcode_amd(mc_amd)) {
pr_err("CPU%d: update failed for patch_level=0x%08x\n",
cpu, mc_amd->hdr.patch_id);
return -1;
}
- pr_info("CPU%d: new patch_level=0x%08x\n", cpu,
- mc_amd->hdr.patch_id);

- uci->cpu_sig.rev = mc_amd->hdr.patch_id;
- c->microcode = mc_amd->hdr.patch_id;
+ rev = mc_amd->hdr.patch_id;
+
+ pr_info("CPU%d: new patch_level=0x%08x\n", cpu, rev);
+
+out:
+ uci->cpu_sig.rev = rev;
+ c->microcode = rev;
+
+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;

return 0;
}
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 2f38a99cdb98..afaf648386e9 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -376,15 +376,8 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
native_rdmsr(MSR_IA32_PLATFORM_ID, val[0], val[1]);
csig.pf = 1 << ((val[1] >> 18) & 7);
}
- native_wrmsr(MSR_IA32_UCODE_REV, 0, 0);

- /* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
-
- /* get the current revision from MSR 0x8B */
- native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
-
- csig.rev = val[1];
+ csig.rev = intel_get_microcode_revision();

uci->cpu_sig = csig;
uci->valid = 1;
@@ -654,31 +647,37 @@ static inline void print_ucode(struct ucode_cpu_info *uci)
static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
{
struct microcode_intel *mc_intel;
- unsigned int val[2];
+ u32 rev;

mc_intel = uci->mc;
if (mc_intel == NULL)
return 0;

+ /*
+ * Save us the MSR write below - which is a particularly expensive
+ * operation - when the other hyperthread has updated the microcode
+ * already.
+ */
+ rev = intel_get_microcode_revision();
+ if (rev >= mc_intel->hdr.rev) {
+ uci->cpu_sig.rev = rev;
+ return 0;
+ }
+
/* write microcode via MSR 0x79 */
native_wrmsr(MSR_IA32_UCODE_WRITE,
(unsigned long) mc_intel->bits,
(unsigned long) mc_intel->bits >> 16 >> 16);
- native_wrmsr(MSR_IA32_UCODE_REV, 0, 0);
-
- /* As documented in the SDM: Do a CPUID 1 here */
- sync_core();

- /* get the current revision from MSR 0x8B */
- native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
- if (val[1] != mc_intel->hdr.rev)
+ rev = intel_get_microcode_revision();
+ if (rev != mc_intel->hdr.rev)
return -1;

#ifdef CONFIG_X86_64
/* Flush global tlb. This is precaution. */
flush_tlb_early();
#endif
- uci->cpu_sig.rev = val[1];
+ uci->cpu_sig.rev = rev;

if (early)
print_ucode(uci);
@@ -852,7 +851,7 @@ static int apply_microcode_intel(int cpu)
{
struct microcode_intel *mc_intel;
struct ucode_cpu_info *uci;
- unsigned int val[2];
+ u32 rev;
int cpu_num = raw_smp_processor_id();
struct cpuinfo_x86 *c = &cpu_data(cpu_num);

@@ -873,31 +872,40 @@ static int apply_microcode_intel(int cpu)
if (get_matching_mc(mc_intel, cpu) == 0)
return 0;

+ /*
+ * Save us the MSR write below - which is a particularly expensive
+ * operation - when the other hyperthread has updated the microcode
+ * already.
+ */
+ rev = intel_get_microcode_revision();
+ if (rev >= mc_intel->hdr.rev)
+ goto out;
+
/* write microcode via MSR 0x79 */
wrmsr(MSR_IA32_UCODE_WRITE,
(unsigned long) mc_intel->bits,
(unsigned long) mc_intel->bits >> 16 >> 16);
- wrmsr(MSR_IA32_UCODE_REV, 0, 0);
-
- /* As documented in the SDM: Do a CPUID 1 here */
- sync_core();

- /* get the current revision from MSR 0x8B */
- rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
+ rev = intel_get_microcode_revision();

- if (val[1] != mc_intel->hdr.rev) {
+ if (rev != mc_intel->hdr.rev) {
pr_err("CPU%d update to revision 0x%x failed\n",
cpu_num, mc_intel->hdr.rev);
return -1;
}
pr_info("CPU%d updated to revision 0x%x, date = %04x-%02x-%02x\n",
- cpu_num, val[1],
+ cpu_num, rev,
mc_intel->hdr.date & 0xffff,
mc_intel->hdr.date >> 24,
(mc_intel->hdr.date >> 16) & 0xff);

- uci->cpu_sig.rev = val[1];
- c->microcode = val[1];
+out:
+ uci->cpu_sig.rev = rev;
+ c->microcode = rev;
+
+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;

return 0;
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 7b79c80ce029..325ed90511cf 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2513,7 +2513,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
return ret;

if (event->attr.precise_ip) {
- if (!event->attr.freq) {
+ if (!(event->attr.freq || event->attr.wakeup_events)) {
event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
if (!(event->attr.sample_type &
~intel_pmu_free_running_flags(event)))
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 697f90db0e37..a4df15f3878e 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -29,6 +29,7 @@
#include <asm/mach_traps.h>
#include <asm/nmi.h>
#include <asm/x86_init.h>
+#include <asm/nospec-branch.h>

#define CREATE_TRACE_POINTS
#include <trace/events/nmi.h>
@@ -522,6 +523,9 @@ nmi_restart:
write_cr2(this_cpu_read(nmi_cr2));
if (this_cpu_dec_return(nmi_state))
goto nmi_restart;
+
+ if (user_mode(regs))
+ mds_user_clear_cpu_buffers();
}
NOKPROBE_SYMBOL(do_nmi);

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index e18c8798c3a2..64090c943f05 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -33,6 +33,8 @@
#include <asm/vm86.h>
#include <asm/spec-ctrl.h>

+#include "process.h"
+
/*
* per-CPU TSS segments. Threads are completely 'soft' on Linux,
* no more per-task TSS's. The TSS size is kept cacheline-aligned
@@ -179,11 +181,12 @@ int set_tsc_mode(unsigned int val)
return 0;
}

-static inline void switch_to_bitmap(struct tss_struct *tss,
- struct thread_struct *prev,
+static inline void switch_to_bitmap(struct thread_struct *prev,
struct thread_struct *next,
unsigned long tifp, unsigned long tifn)
{
+ struct tss_struct *tss = this_cpu_ptr(&cpu_tss);
+
if (tifn & _TIF_IO_BITMAP) {
/*
* Copy the relevant range of the IO bitmap.
@@ -317,32 +320,85 @@ static __always_inline void amd_set_ssb_virt_state(unsigned long tifn)
wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn));
}

-static __always_inline void intel_set_ssb_state(unsigned long tifn)
+/*
+ * Update the MSRs managing speculation control, during context switch.
+ *
+ * tifp: Previous task's thread flags
+ * tifn: Next task's thread flags
+ */
+static __always_inline void __speculation_ctrl_update(unsigned long tifp,
+ unsigned long tifn)
{
- u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn);
+ unsigned long tif_diff = tifp ^ tifn;
+ u64 msr = x86_spec_ctrl_base;
+ bool updmsr = false;
+
+ /*
+ * If TIF_SSBD is different, select the proper mitigation
+ * method. Note that if SSBD mitigation is disabled or permanently
+ * enabled this branch can't be taken because nothing can set
+ * TIF_SSBD.
+ */
+ if (tif_diff & _TIF_SSBD) {
+ if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) {
+ amd_set_ssb_virt_state(tifn);
+ } else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) {
+ amd_set_core_ssb_state(tifn);
+ } else if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+ static_cpu_has(X86_FEATURE_AMD_SSBD)) {
+ msr |= ssbd_tif_to_spec_ctrl(tifn);
+ updmsr = true;
+ }
+ }
+
+ /*
+ * Only evaluate TIF_SPEC_IB if conditional STIBP is enabled,
+ * otherwise avoid the MSR write.
+ */
+ if (IS_ENABLED(CONFIG_SMP) &&
+ static_branch_unlikely(&switch_to_cond_stibp)) {
+ updmsr |= !!(tif_diff & _TIF_SPEC_IB);
+ msr |= stibp_tif_to_spec_ctrl(tifn);
+ }

- wrmsrl(MSR_IA32_SPEC_CTRL, msr);
+ if (updmsr)
+ wrmsrl(MSR_IA32_SPEC_CTRL, msr);
}

-static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
+static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk)
{
- if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
- amd_set_ssb_virt_state(tifn);
- else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
- amd_set_core_ssb_state(tifn);
- else
- intel_set_ssb_state(tifn);
+ if (test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE)) {
+ if (task_spec_ssb_disable(tsk))
+ set_tsk_thread_flag(tsk, TIF_SSBD);
+ else
+ clear_tsk_thread_flag(tsk, TIF_SSBD);
+
+ if (task_spec_ib_disable(tsk))
+ set_tsk_thread_flag(tsk, TIF_SPEC_IB);
+ else
+ clear_tsk_thread_flag(tsk, TIF_SPEC_IB);
+ }
+ /* Return the updated thread_info flags */
+ return task_thread_info(tsk)->flags;
}

-void speculative_store_bypass_update(unsigned long tif)
+void speculation_ctrl_update(unsigned long tif)
{
+ /* Forced update. Make sure all relevant TIF flags are different */
preempt_disable();
- __speculative_store_bypass_update(tif);
+ __speculation_ctrl_update(~tif, tif);
preempt_enable();
}

-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
- struct tss_struct *tss)
+/* Called from seccomp/prctl update */
+void speculation_ctrl_update_current(void)
+{
+ preempt_disable();
+ speculation_ctrl_update(speculation_ctrl_update_tif(current));
+ preempt_enable();
+}
+
+void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
{
struct thread_struct *prev, *next;
unsigned long tifp, tifn;
@@ -352,7 +408,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,

tifn = READ_ONCE(task_thread_info(next_p)->flags);
tifp = READ_ONCE(task_thread_info(prev_p)->flags);
- switch_to_bitmap(tss, prev, next, tifp, tifn);
+ switch_to_bitmap(prev, next, tifp, tifn);

propagate_user_return_notify(prev_p, next_p);

@@ -370,8 +426,15 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
if ((tifp ^ tifn) & _TIF_NOTSC)
cr4_toggle_bits(X86_CR4_TSD);

- if ((tifp ^ tifn) & _TIF_SSBD)
- __speculative_store_bypass_update(tifn);
+ if (likely(!((tifp | tifn) & _TIF_SPEC_FORCE_UPDATE))) {
+ __speculation_ctrl_update(tifp, tifn);
+ } else {
+ speculation_ctrl_update_tif(prev_p);
+ tifn = speculation_ctrl_update_tif(next_p);
+
+ /* Enforce MSR update to ensure consistent state */
+ __speculation_ctrl_update(~tifn, tifn);
+ }
}

/*
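
For reference, the ssbd_tif_to_spec_ctrl()/stibp_tif_to_spec_ctrl() helpers
used above reduce to plain shifts that move a thread-flag bit into its
MSR_IA32_SPEC_CTRL position; x86_spec_ctrl_base carries the bits common to
all tasks. A minimal sketch, assuming TIF_SSBD and TIF_SPEC_IB sit above the
corresponding SPEC_CTRL bit positions, as the asm/spec-ctrl.h helpers this
series relies on do:

static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn)
{
	/* shift the TIF_SSBD bit down into SPEC_CTRL_SSBD (bit 2) */
	return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
}

static inline u64 stibp_tif_to_spec_ctrl(u64 tifn)
{
	/* shift the TIF_SPEC_IB bit down into SPEC_CTRL_STIBP (bit 1) */
	return (tifn & _TIF_SPEC_IB) >> (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
}
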
diff --git a/arch/x86/kernel/process.h b/arch/x86/kernel/process.h
new file mode 100644
index 000000000000..898e97cf6629
--- /dev/null
+++ b/arch/x86/kernel/process.h
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Code shared between 32 and 64 bit
+
+#include <asm/spec-ctrl.h>
+
+void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p);
+
+/*
+ * This needs to be inline to optimize for the common case where no extra
+ * work needs to be done.
+ */
+static inline void switch_to_extra(struct task_struct *prev,
+ struct task_struct *next)
+{
+ unsigned long next_tif = task_thread_info(next)->flags;
+ unsigned long prev_tif = task_thread_info(prev)->flags;
+
+ if (IS_ENABLED(CONFIG_SMP)) {
+ /*
+ * Avoid __switch_to_xtra() invocation when conditional
+ * STIBP is disabled and the only different bit is
+ * TIF_SPEC_IB. For CONFIG_SMP=n TIF_SPEC_IB is not
+ * in the TIF_WORK_CTXSW masks.
+ */
+ if (!static_branch_likely(&switch_to_cond_stibp)) {
+ prev_tif &= ~_TIF_SPEC_IB;
+ next_tif &= ~_TIF_SPEC_IB;
+ }
+ }
+
+ /*
+ * __switch_to_xtra() handles debug registers, i/o bitmaps,
+ * speculation mitigations etc.
+ */
+ if (unlikely(next_tif & _TIF_WORK_CTXSW_NEXT ||
+ prev_tif & _TIF_WORK_CTXSW_PREV))
+ __switch_to_xtra(prev, next);
+}
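
The masking above is why conditional STIBP is nearly free when disabled:
with the static branch off, TIF_SPEC_IB is stripped from both flag words
before the _TIF_WORK_CTXSW_* test, so two tasks that differ only in that
bit never reach __switch_to_xtra(). A self-contained demo of the trick,
using hypothetical bit positions:

#include <stdio.h>

#define TIF_SPEC_IB_BIT	(1UL << 9)		/* hypothetical position */
#define TIF_WORK_CTXSW	(TIF_SPEC_IB_BIT | (1UL << 3))

int main(void)
{
	unsigned long prev_tif = TIF_SPEC_IB_BIT;	/* prev opted in */
	unsigned long next_tif = 0;			/* next did not */
	int cond_stibp = 0;				/* static branch off */

	if (!cond_stibp) {
		prev_tif &= ~TIF_SPEC_IB_BIT;
		next_tif &= ~TIF_SPEC_IB_BIT;
	}
	/* prints "skip": __switch_to_xtra() is avoided on this path */
	printf("%s\n", ((prev_tif | next_tif) & TIF_WORK_CTXSW) ? "call" : "skip");
	return 0;
}
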
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 9f950917528b..85b112efac30 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -55,6 +55,8 @@
#include <asm/switch_to.h>
#include <asm/vm86.h>

+#include "process.h"
+
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
asmlinkage void ret_from_kernel_thread(void) __asm__("ret_from_kernel_thread");

@@ -279,12 +281,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
set_iopl_mask(next->iopl);

- /*
- * Now maybe handle debug registers and/or IO bitmaps
- */
- if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
- task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
- __switch_to_xtra(prev_p, next_p, tss);
+ switch_to_extra(prev_p, next_p);

/*
* Leave lazy mode, flushing any hypercalls made here.
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index c7cc81e9bb84..618565fecb1c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -50,6 +50,8 @@
#include <asm/switch_to.h>
#include <asm/xen/hypervisor.h>

+#include "process.h"
+
asmlinkage extern void ret_from_fork(void);

__visible DEFINE_PER_CPU(unsigned long, rsp_scratch);
@@ -406,12 +408,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
/* Reload esp0 and ss1. This changes current_thread_info(). */
load_sp0(tss, next);

- /*
- * Now maybe reload the debug registers and handle I/O bitmaps
- */
- if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
- task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
- __switch_to_xtra(prev_p, next_p, tss);
+ switch_to_extra(prev_p, next_p);

#ifdef CONFIG_XEN
/*
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 8c73bf1492b8..6223929fc621 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -61,6 +61,7 @@
#include <asm/alternative.h>
#include <asm/fpu/xstate.h>
#include <asm/trace/mpx.h>
+#include <asm/nospec-branch.h>
#include <asm/mpx.h>
#include <asm/vm86.h>

@@ -337,6 +338,13 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
regs->ip = (unsigned long)general_protection;
regs->sp = (unsigned long)&normal_regs->orig_ax;

+ /*
+ * This situation can be triggered by userspace via
+ * modify_ldt(2) and the return does not take the regular
+ * user space exit, so a CPU buffer clear is required when
+ * MDS mitigation is enabled.
+ */
+ mds_user_clear_cpu_buffers();
return;
}
#endif
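
mds_user_clear_cpu_buffers() is gated by a static key and ultimately
executes VERW, whose side effect the MD_CLEAR microcode update redefines to
flush the store buffers, fill buffers and load ports. Roughly, per the
asm/nospec-branch.h helpers this series adds:

static inline void mds_clear_cpu_buffers(void)
{
	static const u16 ds = __KERNEL_DS;

	/*
	 * VERW with a valid writable segment selector clears the CPU
	 * buffers when the MD_CLEAR microcode update is installed.
	 */
	asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}

static inline void mds_user_clear_cpu_buffers(void)
{
	if (static_branch_likely(&mds_user_clear))
		mds_clear_cpu_buffers();
}
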
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b857bb9f6f23..53918abccbc3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -343,7 +343,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,

/* cpuid 0x80000008.ebx */
const u32 kvm_cpuid_8000_0008_ebx_x86_features =
- F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD);
+ F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | F(VIRT_SSBD) |
+ F(AMD_SSB_NO) | F(AMD_STIBP);

/* cpuid 0xC0000001.edx */
const u32 kvm_supported_word5_x86_features =
@@ -364,7 +365,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,

/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
- F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES);
+ F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) |
+ F(INTEL_STIBP) | F(MD_CLEAR);

/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
@@ -607,7 +609,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
entry->ebx |= F(VIRT_SSBD);
entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
- if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD))
+ /*
+ * The preference is to use SPEC CTRL MSR instead of the
+ * VIRT_SPEC MSR.
+ */
+ if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) &&
+ !boot_cpu_has(X86_FEATURE_AMD_SSBD))
entry->ebx |= F(VIRT_SSBD);
break;
}
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 72f159f4d456..8c28926dc900 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -175,7 +175,7 @@ static inline bool guest_cpuid_has_spec_ctrl(struct kvm_vcpu *vcpu)
struct kvm_cpuid_entry2 *best;

best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
- if (best && (best->ebx & bit(X86_FEATURE_AMD_IBRS)))
+ if (best && (best->ebx & (bit(X86_FEATURE_AMD_IBRS) | bit(X86_FEATURE_AMD_SSBD))))
return true;
best = kvm_find_cpuid_entry(vcpu, 7, 0);
return best && (best->edx & (bit(X86_FEATURE_SPEC_CTRL) | bit(X86_FEATURE_SPEC_CTRL_SSBD)));
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index acbde1249b6f..9fc536657492 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3197,7 +3197,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
return 1;

/* The STIBP bit doesn't fault even if it's not advertised */
- if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+ if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD))
return 1;

svm->spec_ctrl = data;
@@ -3928,8 +3928,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)

clgi();

- local_irq_enable();
-
/*
* If this vCPU has touched SPEC_CTRL, restore the guest's value if
* it's non-zero. Since vmentry is serialising on affected CPUs, there
@@ -3938,6 +3936,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
*/
x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);

+ local_irq_enable();
+
asm volatile (
"push %%" _ASM_BP "; \n\t"
"mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t"
@@ -4060,12 +4060,12 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);

- x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
-
reload_tss(vcpu);

local_irq_disable();

+ x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+
vcpu->arch.cr2 = svm->vmcb->save.cr2;
vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
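
Both hunks above are ordering fixes with the same goal: the guest SPEC_CTRL
value is loaded while interrupts are still disabled, and the host value is
restored before they are re-enabled, so no interrupt handler can ever run
with the guest's speculation controls in effect. The resulting flow,
schematically:

	clgi();
	x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
	local_irq_enable();		/* guest value is now live */

	/* ... VMRUN / VMEXIT ... */

	reload_tss(vcpu);
	local_irq_disable();
	x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
					/* host value back before IRQs fire */
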
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index ab9ae67a80e4..0ec94c6b4757 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -434,13 +434,13 @@ TRACE_EVENT(kvm_apic_ipi,
);

TRACE_EVENT(kvm_apic_accept_irq,
- TP_PROTO(__u32 apicid, __u16 dm, __u8 tm, __u8 vec),
+ TP_PROTO(__u32 apicid, __u16 dm, __u16 tm, __u8 vec),
TP_ARGS(apicid, dm, tm, vec),

TP_STRUCT__entry(
__field( __u32, apicid )
__field( __u16, dm )
- __field( __u8, tm )
+ __field( __u16, tm )
__field( __u8, vec )
),

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 706c5d63a53f..d830a0d60ba4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2972,6 +2972,10 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
| KVM_VCPUEVENT_VALID_SMM))
return -EINVAL;

+ if (events->exception.injected &&
+ (events->exception.nr > 31 || events->exception.nr == NMI_VECTOR))
+ return -EINVAL;
+
/* INITs are latched while in SMM */
if (events->flags & KVM_VCPUEVENT_VALID_SMM &&
(events->smi.smm || events->smi.pending) &&
diff --git a/arch/x86/mm/kaiser.c b/arch/x86/mm/kaiser.c
index 7a72e32e4806..2cbcd6f3317d 100644
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -10,6 +10,7 @@
#include <linux/mm.h>
#include <linux/uaccess.h>
#include <linux/ftrace.h>
+#include <linux/cpu.h>

#undef pr_fmt
#define pr_fmt(fmt) "Kernel/User page tables isolation: " fmt
@@ -297,7 +298,8 @@ void __init kaiser_check_boottime_disable(void)
goto skip;
}

- if (cmdline_find_option_bool(boot_command_line, "nopti"))
+ if (cmdline_find_option_bool(boot_command_line, "nopti") ||
+ cpu_mitigations_off())
goto disable;

skip:
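
cpu_mitigations_off() is the arch-independent helper behind the new
mitigations= command line parameter, so booting with mitigations=off now
disables KPTI alongside the existing nopti switch. The helper, added in
kernel/cpu.c earlier in this series, is roughly:

enum cpu_mitigations {
	CPU_MITIGATIONS_OFF,
	CPU_MITIGATIONS_AUTO,
	CPU_MITIGATIONS_AUTO_NOSMT,
};

static enum cpu_mitigations cpu_mitigations = CPU_MITIGATIONS_AUTO;

static int __init mitigations_parse_cmdline(char *arg)
{
	if (!strcmp(arg, "off"))
		cpu_mitigations = CPU_MITIGATIONS_OFF;
	else if (!strcmp(arg, "auto"))
		cpu_mitigations = CPU_MITIGATIONS_AUTO;
	else if (!strcmp(arg, "auto,nosmt"))
		cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT;
	return 0;
}
early_param("mitigations", mitigations_parse_cmdline);

/* mitigations=off */
bool cpu_mitigations_off(void)
{
	return cpu_mitigations == CPU_MITIGATIONS_OFF;
}
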
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 55c7446311a7..50f75768aadd 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -247,7 +247,7 @@ static void pgd_mop_up_pmds(struct mm_struct *mm, pgd_t *pgdp)
if (pgd_val(pgd) != 0) {
pmd_t *pmd = (pmd_t *)pgd_page_vaddr(pgd);

- pgdp[i] = native_make_pgd(0);
+ pgd_clear(&pgdp[i]);

paravirt_release_pmd(pgd_val(pgd) >> PAGE_SHIFT);
pmd_free(mm, pmd);
@@ -424,7 +424,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
int changed = !pte_same(*ptep, entry);

if (changed && dirty) {
- *ptep = entry;
+ set_pte(ptep, entry);
pte_update_defer(vma->vm_mm, address, ptep);
}

@@ -441,7 +441,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma,
VM_BUG_ON(address & ~HPAGE_PMD_MASK);

if (changed && dirty) {
- *pmdp = entry;
+ set_pmd(pmdp, entry);
pmd_update_defer(vma->vm_mm, address, pmdp);
/*
* We had a write-protection fault here and changed the pmd
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 6d683bbb3502..f3237e4cb18f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -30,6 +30,12 @@
* Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
*/

+/*
+ * Use bit 0 to mangle the TIF_SPEC_IB state into the mm pointer which is
+ * stored in cpu_tlbstate.last_user_mm_ibpb.
+ */
+#define LAST_USER_MM_IBPB 0x1UL
+
atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);

struct flush_tlb_info {
@@ -101,41 +107,101 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
local_irq_restore(flags);
}

-void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
- struct task_struct *tsk)
+static inline unsigned long mm_mangle_tif_spec_ib(struct task_struct *next)
{
- unsigned cpu = smp_processor_id();
+ unsigned long next_tif = task_thread_info(next)->flags;
+ unsigned long ibpb = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_IBPB;

- if (likely(prev != next)) {
- u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
+ return (unsigned long)next->mm | ibpb;
+}
+
+static void cond_ibpb(struct task_struct *next)
+{
+ if (!next || !next->mm)
+ return;
+
+ /*
+ * Both the conditional and the always IBPB mode use the mm
+ * pointer to avoid the IBPB when switching between tasks of the
+ * same process. Using the mm pointer instead of mm->context.ctx_id
+ * opens a hypothetical hole vs. mm_struct reuse, which is more or
+ * less impossible to control by an attacker. Aside from that, it
+ * would only affect the first schedule, so the theoretically
+ * exposed data is not really interesting.
+ */
+ if (static_branch_likely(&switch_mm_cond_ibpb)) {
+ unsigned long prev_mm, next_mm;

/*
- * Avoid user/user BTB poisoning by flushing the branch
- * predictor when switching between processes. This stops
- * one process from doing Spectre-v2 attacks on another.
+ * This is a bit more complex than the always mode because
+ * it has to handle two cases:
+ *
+ * 1) Switch from a user space task (potential attacker)
+ * which has TIF_SPEC_IB set to a user space task
+ * (potential victim) which has TIF_SPEC_IB not set.
+ *
+ * 2) Switch from a user space task (potential attacker)
+ * which has TIF_SPEC_IB not set to a user space task
+ * (potential victim) which has TIF_SPEC_IB set.
+ *
+ * This could be done by unconditionally issuing IBPB when
+ * a task which has TIF_SPEC_IB set is either scheduled in
+ * or out. Though that results in two flushes when:
+ *
+ * - the same user space task is scheduled out and later
+ * scheduled in again and only a kernel thread ran in
+ * between.
+ *
+ * - a user space task belonging to the same process is
+ * scheduled in after a kernel thread ran in between
*
- * As an optimization, flush indirect branches only when
- * switching into processes that disable dumping. This
- * protects high value processes like gpg, without having
- * too high performance overhead. IBPB is *expensive*!
+ * - a user space task belonging to the same process is
+ * scheduled in immediately.
*
- * This will not flush branches when switching into kernel
- * threads. It will also not flush if we switch to idle
- * thread and back to the same process. It will flush if we
- * switch to a different non-dumpable process.
+ * Optimize this with reasonably small overhead for the
+ * above cases. Mangle the TIF_SPEC_IB bit into the mm
+ * pointer of the incoming task which is stored in
+ * cpu_tlbstate.last_user_mm_ibpb for comparison.
*/
- if (tsk && tsk->mm &&
- tsk->mm->context.ctx_id != last_ctx_id &&
- get_dumpable(tsk->mm) != SUID_DUMP_USER)
+ next_mm = mm_mangle_tif_spec_ib(next);
+ prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_ibpb);
+
+ /*
+ * Issue IBPB only if the mms are different and one or
+ * both have the IBPB bit set.
+ */
+ if (next_mm != prev_mm &&
+ (next_mm | prev_mm) & LAST_USER_MM_IBPB)
indirect_branch_prediction_barrier();

+ this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm);
+ }
+
+ if (static_branch_unlikely(&switch_mm_always_ibpb)) {
/*
- * Record last user mm's context id, so we can avoid
- * flushing branch buffer with IBPB if we switch back
- * to the same user.
+ * Only flush when switching to a user space task with a
+ * different context than the user space task which ran
+ * last on this CPU.
+ */
+ if (this_cpu_read(cpu_tlbstate.last_user_mm) != next->mm) {
+ indirect_branch_prediction_barrier();
+ this_cpu_write(cpu_tlbstate.last_user_mm, next->mm);
+ }
+ }
+}
+
+void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
+ struct task_struct *tsk)
+{
+ unsigned cpu = smp_processor_id();
+
+ if (likely(prev != next)) {
+ /*
+ * Avoid user/user BTB poisoning by flushing the branch
+ * predictor when switching between processes. This stops
+ * one process from doing Spectre-v2 attacks on another.
*/
- if (next != &init_mm)
- this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+ cond_ibpb(tsk);

this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
this_cpu_write(cpu_tlbstate.active_mm, next);
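
Because mm_struct is word aligned, bit 0 of the mm pointer is always free,
so the TIF_SPEC_IB state can be folded into it and a single compare covers
both "different process" and "IBPB opt-in changed". Worked cases, with
hypothetical pointer values:

#define LAST_USER_MM_IBPB	0x1UL

unsigned long mm_a    = 0xffff880012340000UL;	  /* process A, no TIF_SPEC_IB */
unsigned long mm_a_ib = mm_a | LAST_USER_MM_IBPB; /* process A, TIF_SPEC_IB set */
unsigned long mm_b    = 0xffff880098760000UL;	  /* process B, no TIF_SPEC_IB */

/*
 * IBPB fires when next_mm != prev_mm && (next_mm | prev_mm) has bit 0 set:
 *
 *   mm_a    -> mm_a    : pointers equal		-> no IBPB
 *   mm_a    -> mm_b    : differ, neither opted in	-> no IBPB
 *   mm_a_ib -> mm_b    : differ, attacker opted in	-> IBPB
 *   mm_b    -> mm_a_ib : differ, victim opted in	-> IBPB
 */
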
diff --git a/drivers/ata/libata-zpodd.c b/drivers/ata/libata-zpodd.c
index 0ad96c647541..7017a81d53cf 100644
--- a/drivers/ata/libata-zpodd.c
+++ b/drivers/ata/libata-zpodd.c
@@ -51,38 +51,52 @@ static int eject_tray(struct ata_device *dev)
/* Per the spec, only slot type and drawer type ODD can be supported */
static enum odd_mech_type zpodd_get_mech_type(struct ata_device *dev)
{
- char buf[16];
+ char *buf;
unsigned int ret;
- struct rm_feature_desc *desc = (void *)(buf + 8);
+ struct rm_feature_desc *desc;
struct ata_taskfile tf;
static const char cdb[] = { GPCMD_GET_CONFIGURATION,
2, /* only 1 feature descriptor requested */
0, 3, /* 3, removable medium feature */
0, 0, 0,/* reserved */
- 0, sizeof(buf),
+ 0, 16,
0, 0, 0,
};

+ buf = kzalloc(16, GFP_KERNEL);
+ if (!buf)
+ return ODD_MECH_TYPE_UNSUPPORTED;
+ desc = (void *)(buf + 8);
+
ata_tf_init(dev, &tf);
tf.flags = ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
tf.command = ATA_CMD_PACKET;
tf.protocol = ATAPI_PROT_PIO;
- tf.lbam = sizeof(buf);
+ tf.lbam = 16;

ret = ata_exec_internal(dev, &tf, cdb, DMA_FROM_DEVICE,
- buf, sizeof(buf), 0);
- if (ret)
+ buf, 16, 0);
+ if (ret) {
+ kfree(buf);
return ODD_MECH_TYPE_UNSUPPORTED;
+ }

- if (be16_to_cpu(desc->feature_code) != 3)
+ if (be16_to_cpu(desc->feature_code) != 3) {
+ kfree(buf);
return ODD_MECH_TYPE_UNSUPPORTED;
+ }

- if (desc->mech_type == 0 && desc->load == 0 && desc->eject == 1)
+ if (desc->mech_type == 0 && desc->load == 0 && desc->eject == 1) {
+ kfree(buf);
return ODD_MECH_TYPE_SLOT;
- else if (desc->mech_type == 1 && desc->load == 0 && desc->eject == 1)
+ } else if (desc->mech_type == 1 && desc->load == 0 &&
+ desc->eject == 1) {
+ kfree(buf);
return ODD_MECH_TYPE_DRAWER;
- else
+ } else {
+ kfree(buf);
return ODD_MECH_TYPE_UNSUPPORTED;
+ }
}

/* Test if ODD is zero power ready by sense code */
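
The switch from char buf[16] on the stack to kzalloc() is the substance of
the change above: the buffer is a DMA target for ata_exec_internal(), and
DMA into stack memory is unsafe (it can share cache lines with live stack
data, and is impossible at all with virtually mapped stacks). Reduced to
its pattern, with hypothetical helper names:

	char *buf = kzalloc(16, GFP_KERNEL);	/* heap memory is DMA-safe */
	if (!buf)
		return -ENOMEM;

	ret = dma_read_into(dev, buf, 16);	/* hypothetical transfer */
	if (!ret)
		parse_descriptor(buf + 8);	/* hypothetical consumer */

	kfree(buf);				/* freed on every exit path */
	return ret;
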
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 41090ef5facb..3934aaf9d157 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -530,11 +530,18 @@ ssize_t __weak cpu_show_l1tf(struct device *dev,
return sprintf(buf, "Not affected\n");
}

+ssize_t __weak cpu_show_mds(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
+static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);

static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -542,6 +549,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_spectre_v2.attr,
&dev_attr_spec_store_bypass.attr,
&dev_attr_l1tf.attr,
+ &dev_attr_mds.attr,
NULL
};

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ae361ee90587..da3902ac16c8 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -82,7 +82,6 @@

static DEFINE_IDR(loop_index_idr);
static DEFINE_MUTEX(loop_index_mutex);
-static DEFINE_MUTEX(loop_ctl_mutex);

static int max_part;
static int part_shift;
@@ -1045,7 +1044,7 @@ static int loop_clr_fd(struct loop_device *lo)
*/
if (atomic_read(&lo->lo_refcnt) > 1) {
lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
return 0;
}

@@ -1094,12 +1093,12 @@ static int loop_clr_fd(struct loop_device *lo)
if (!part_shift)
lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
loop_unprepare_queue(lo);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
/*
- * Need not hold loop_ctl_mutex to fput backing file.
- * Calling fput holding loop_ctl_mutex triggers a circular
+ * Need not hold lo_ctl_mutex to fput backing file.
+ * Calling fput holding lo_ctl_mutex triggers a circular
* lock dependency possibility warning as fput can take
- * bd_mutex which is usually taken before loop_ctl_mutex.
+ * bd_mutex which is usually taken before lo_ctl_mutex.
*/
fput(filp);
return 0;
@@ -1362,7 +1361,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
struct loop_device *lo = bdev->bd_disk->private_data;
int err;

- mutex_lock_nested(&loop_ctl_mutex, 1);
+ mutex_lock_nested(&lo->lo_ctl_mutex, 1);
switch (cmd) {
case LOOP_SET_FD:
err = loop_set_fd(lo, mode, bdev, arg);
@@ -1371,7 +1370,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
err = loop_change_fd(lo, bdev, arg);
break;
case LOOP_CLR_FD:
- /* loop_clr_fd would have unlocked loop_ctl_mutex on success */
+ /* loop_clr_fd would have unlocked lo_ctl_mutex on success */
err = loop_clr_fd(lo);
if (!err)
goto out_unlocked;
@@ -1407,7 +1406,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
default:
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
}
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);

out_unlocked:
return err;
@@ -1540,16 +1539,16 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,

switch(cmd) {
case LOOP_SET_STATUS:
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
err = loop_set_status_compat(
lo, (const struct compat_loop_info __user *) arg);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
case LOOP_GET_STATUS:
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
err = loop_get_status_compat(
lo, (struct compat_loop_info __user *) arg);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
case LOOP_SET_CAPACITY:
case LOOP_CLR_FD:
@@ -1593,7 +1592,7 @@ static void __lo_release(struct loop_device *lo)
if (atomic_dec_return(&lo->lo_refcnt))
return;

- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) {
/*
* In autoclear mode, stop the loop thread
@@ -1610,7 +1609,7 @@ static void __lo_release(struct loop_device *lo)
loop_flush(lo);
}

- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
}

static void lo_release(struct gendisk *disk, fmode_t mode)
@@ -1656,10 +1655,10 @@ static int unregister_transfer_cb(int id, void *ptr, void *data)
struct loop_device *lo = ptr;
struct loop_func_table *xfer = data;

- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_encryption == xfer)
loop_release_xfer(lo);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
return 0;
}

@@ -1821,6 +1820,7 @@ static int loop_add(struct loop_device **l, int i)
if (!part_shift)
disk->flags |= GENHD_FL_NO_PART_SCAN;
disk->flags |= GENHD_FL_EXT_DEVT;
+ mutex_init(&lo->lo_ctl_mutex);
atomic_set(&lo->lo_refcnt, 0);
lo->lo_number = i;
spin_lock_init(&lo->lo_lock);
@@ -1933,19 +1933,19 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
ret = loop_lookup(&lo, parm);
if (ret < 0)
break;
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_state != Lo_unbound) {
ret = -EBUSY;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
}
if (atomic_read(&lo->lo_refcnt) > 0) {
ret = -EBUSY;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
}
lo->lo_disk->private_data = NULL;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
idr_remove(&loop_index_idr, lo->lo_number);
loop_remove(lo);
break;
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index a923e74495ce..60f0fd2c0c65 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -55,6 +55,7 @@ struct loop_device {

spinlock_t lo_lock;
int lo_state;
+ struct mutex lo_ctl_mutex;
struct kthread_worker worker;
struct task_struct *worker_task;
bool use_dio;
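
The lock-ordering comment above is the crux of this change: elsewhere
bd_mutex is taken before the loop control mutex, so calling fput() (which
can take bd_mutex via the final blkdev release) while still holding
lo_ctl_mutex inverts that order. Schematically:

/*
 *   path 1: blkdev ioctl   -> bd_mutex     -> lo_ctl_mutex
 *   path 2: loop_clr_fd()  -> lo_ctl_mutex -> fput() -> bd_mutex
 *
 * Acquiring the two mutexes in opposite orders on the two paths is a
 * classic AB-BA pattern; dropping lo_ctl_mutex before fput() keeps one
 * global order and avoids the circular-dependency warning.
 */
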
diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
index c4328d9d9981..f838119d12b2 100644
--- a/drivers/block/xsysace.c
+++ b/drivers/block/xsysace.c
@@ -1062,6 +1062,8 @@ static int ace_setup(struct ace_device *ace)
return 0;

err_read:
+ /* prevent double queue cleanup */
+ ace->gd->queue = NULL;
put_disk(ace->gd);
err_alloc_disk:
blk_cleanup_queue(ace->queue);
diff --git a/drivers/gpu/ipu-v3/ipu-dp.c b/drivers/gpu/ipu-v3/ipu-dp.c
index 98686edbcdbb..33de3a1bac49 100644
--- a/drivers/gpu/ipu-v3/ipu-dp.c
+++ b/drivers/gpu/ipu-v3/ipu-dp.c
@@ -195,7 +195,8 @@ int ipu_dp_setup_channel(struct ipu_dp *dp,
ipu_dp_csc_init(flow, flow->foreground.in_cs, flow->out_cs,
DP_COM_CONF_CSC_DEF_BOTH);
} else {
- if (flow->foreground.in_cs == flow->out_cs)
+ if (flow->foreground.in_cs == IPUV3_COLORSPACE_UNKNOWN ||
+ flow->foreground.in_cs == flow->out_cs)
/*
* foreground identical to output, apply color
* conversion on background
@@ -261,6 +262,8 @@ void ipu_dp_disable_channel(struct ipu_dp *dp)
struct ipu_dp_priv *priv = flow->priv;
u32 reg, csc;

+ dp->in_cs = IPUV3_COLORSPACE_UNKNOWN;
+
if (!dp->foreground)
return;

@@ -268,8 +271,9 @@ void ipu_dp_disable_channel(struct ipu_dp *dp)

reg = readl(flow->base + DP_COM_CONF);
csc = reg & DP_COM_CONF_CSC_DEF_MASK;
- if (csc == DP_COM_CONF_CSC_DEF_FG)
- reg &= ~DP_COM_CONF_CSC_DEF_MASK;
+ reg &= ~DP_COM_CONF_CSC_DEF_MASK;
+ if (csc == DP_COM_CONF_CSC_DEF_BOTH || csc == DP_COM_CONF_CSC_DEF_BG)
+ reg |= DP_COM_CONF_CSC_DEF_BG;

reg &= ~DP_COM_CONF_FG_EN;
writel(reg, flow->base + DP_COM_CONF);
@@ -350,6 +354,8 @@ int ipu_dp_init(struct ipu_soc *ipu, struct device *dev, unsigned long base)
mutex_init(&priv->mutex);

for (i = 0; i < IPUV3_NUM_FLOWS; i++) {
+ priv->flow[i].background.in_cs = IPUV3_COLORSPACE_UNKNOWN;
+ priv->flow[i].foreground.in_cs = IPUV3_COLORSPACE_UNKNOWN;
priv->flow[i].foreground.foreground = true;
priv->flow[i].base = priv->base + ipu_dp_flow_base[i];
priv->flow[i].priv = priv;
diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
index d7179dd3c9ef..3cafa1d28fed 100644
--- a/drivers/hid/hid-debug.c
+++ b/drivers/hid/hid-debug.c
@@ -1058,10 +1058,15 @@ static int hid_debug_rdesc_show(struct seq_file *f, void *p)
seq_printf(f, "\n\n");

/* dump parsed data and input mappings */
+ if (down_interruptible(&hdev->driver_input_lock))
+ return 0;
+
hid_dump_device(hdev, f);
seq_printf(f, "\n");
hid_dump_input_mapping(hdev, f);

+ up(&hdev->driver_input_lock);
+
return 0;
}

diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
index 8d74e691ac90..ee3c66c02043 100644
--- a/drivers/hid/hid-input.c
+++ b/drivers/hid/hid-input.c
@@ -783,6 +783,10 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel
case 0x074: map_key_clear(KEY_BRIGHTNESS_MAX); break;
case 0x075: map_key_clear(KEY_BRIGHTNESS_AUTO); break;

+ case 0x079: map_key_clear(KEY_KBDILLUMUP); break;
+ case 0x07a: map_key_clear(KEY_KBDILLUMDOWN); break;
+ case 0x07c: map_key_clear(KEY_KBDILLUMTOGGLE); break;
+
case 0x082: map_key_clear(KEY_VIDEO_NEXT); break;
case 0x083: map_key_clear(KEY_LAST); break;
case 0x084: map_key_clear(KEY_ENTER); break;
@@ -913,6 +917,8 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel
case 0x2cb: map_key_clear(KEY_KBDINPUTASSIST_ACCEPT); break;
case 0x2cc: map_key_clear(KEY_KBDINPUTASSIST_CANCEL); break;

+ case 0x29f: map_key_clear(KEY_SCALE); break;
+
default: map_key_clear(KEY_UNKNOWN);
}
break;
diff --git a/drivers/hwtracing/intel_th/gth.c b/drivers/hwtracing/intel_th/gth.c
index eb43943cdf07..189eb6269971 100644
--- a/drivers/hwtracing/intel_th/gth.c
+++ b/drivers/hwtracing/intel_th/gth.c
@@ -597,7 +597,7 @@ static void intel_th_gth_unassign(struct intel_th_device *thdev,
othdev->output.port = -1;
othdev->output.active = false;
gth->output[port].output = NULL;
- for (master = 0; master < TH_CONFIGURABLE_MASTERS; master++)
+ for (master = 0; master <= TH_CONFIGURABLE_MASTERS; master++)
if (gth->master[master] == port)
gth->master[master] = -1;
spin_unlock(&gth->gth_lock);
diff --git a/drivers/iio/adc/xilinx-xadc-core.c b/drivers/iio/adc/xilinx-xadc-core.c
index 475c5a74f2d1..6398e86a272b 100644
--- a/drivers/iio/adc/xilinx-xadc-core.c
+++ b/drivers/iio/adc/xilinx-xadc-core.c
@@ -1299,7 +1299,7 @@ static int xadc_remove(struct platform_device *pdev)
}
free_irq(irq, indio_dev);
clk_disable_unprepare(xadc->clk);
- cancel_delayed_work(&xadc->zynq_unmask_work);
+ cancel_delayed_work_sync(&xadc->zynq_unmask_work);
kfree(xadc->data);
kfree(indio_dev->channels);

diff --git a/drivers/input/keyboard/snvs_pwrkey.c b/drivers/input/keyboard/snvs_pwrkey.c
index 9adf13a5864a..57143365e945 100644
--- a/drivers/input/keyboard/snvs_pwrkey.c
+++ b/drivers/input/keyboard/snvs_pwrkey.c
@@ -156,6 +156,9 @@ static int imx_snvs_pwrkey_probe(struct platform_device *pdev)
return error;
}

+ pdata->input = input;
+ platform_set_drvdata(pdev, pdata);
+
error = devm_request_irq(&pdev->dev, pdata->irq,
imx_snvs_pwrkey_interrupt,
0, pdev->name, pdev);
@@ -172,9 +175,6 @@ static int imx_snvs_pwrkey_probe(struct platform_device *pdev)
return error;
}

- pdata->input = input;
- platform_set_drvdata(pdev, pdata);
-
device_init_wakeup(&pdev->dev, pdata->wakeup);

return 0;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 94f1bf772ec9..db85cc5791dc 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -295,7 +295,7 @@ static void iommu_write_l2(struct amd_iommu *iommu, u8 address, u32 val)
static void iommu_set_exclusion_range(struct amd_iommu *iommu)
{
u64 start = iommu->exclusion_start & PAGE_MASK;
- u64 limit = (start + iommu->exclusion_length) & PAGE_MASK;
+ u64 limit = (start + iommu->exclusion_length - 1) & PAGE_MASK;
u64 entry;

if (!iommu->exclusion_start)
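
The "- 1" cures an off-by-one in the exclusion-range limit, assuming the
hardware register holds the inclusive address of the last excluded byte,
which is what the fix implies. Worked with 4 KiB pages:

	u64 start  = 0x100000;			/* page-aligned base */
	u64 length = 0x1000;			/* exactly one page  */

	u64 old_limit = (start + length)     & ~0xfffULL; /* 0x101000: one page too far */
	u64 new_limit = (start + length - 1) & ~0xfffULL; /* 0x100000: correct last page */
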
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5e65dc6def7e..17517889d46b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3897,26 +3897,15 @@ static void handle_parity_checks6(struct r5conf *conf, struct stripe_head *sh,
case check_state_check_result:
sh->check_state = check_state_idle;

+ if (s->failed > 1)
+ break;
/* handle a successful check operation, if parity is correct
* we are done. Otherwise update the mismatch count and repair
* parity if !MD_RECOVERY_CHECK
*/
if (sh->ops.zero_sum_result == 0) {
- /* both parities are correct */
- if (!s->failed)
- set_bit(STRIPE_INSYNC, &sh->state);
- else {
- /* in contrast to the raid5 case we can validate
- * parity, but still have a failure to write
- * back
- */
- sh->check_state = check_state_compute_result;
- /* Returning at this point means that we may go
- * off and bring p and/or q uptodate again so
- * we make sure to check zero_sum_result again
- * to verify if p or q need writeback
- */
- }
+ /* Any parity checked was correct */
+ set_bit(STRIPE_INSYNC, &sh->state);
} else {
atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches);
if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery))
diff --git a/drivers/media/i2c/ov7670.c b/drivers/media/i2c/ov7670.c
index e1b5dc84c14e..24a0c21a3d8d 100644
--- a/drivers/media/i2c/ov7670.c
+++ b/drivers/media/i2c/ov7670.c
@@ -155,10 +155,10 @@ MODULE_PARM_DESC(debug, "Debug level (0-1)");
#define REG_GFIX 0x69 /* Fix gain control */

#define REG_DBLV 0x6b /* PLL control an debugging */
-#define DBLV_BYPASS 0x00 /* Bypass PLL */
-#define DBLV_X4 0x01 /* clock x4 */
-#define DBLV_X6 0x10 /* clock x6 */
-#define DBLV_X8 0x11 /* clock x8 */
+#define DBLV_BYPASS 0x0a /* Bypass PLL */
+#define DBLV_X4 0x4a /* clock x4 */
+#define DBLV_X6 0x8a /* clock x6 */
+#define DBLV_X8 0xca /* clock x8 */

#define REG_REG76 0x76 /* OV's name */
#define R76_BLKPCOR 0x80 /* Black pixel correction enable */
@@ -833,7 +833,7 @@ static int ov7675_set_framerate(struct v4l2_subdev *sd,
if (ret < 0)
return ret;

- return ov7670_write(sd, REG_DBLV, DBLV_X4);
+ return 0;
}

static void ov7670_get_framerate_legacy(struct v4l2_subdev *sd,
@@ -1578,11 +1578,7 @@ static int ov7670_probe(struct i2c_client *client,
if (config->clock_speed)
info->clock_speed = config->clock_speed;

- /*
- * It should be allowed for ov7670 too when it is migrated to
- * the new frame rate formula.
- */
- if (config->pll_bypass && id->driver_data != MODEL_OV7670)
+ if (config->pll_bypass)
info->pll_bypass = true;

if (config->pclk_hb_disable)
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 66560a8fcfa2..1022e80aaf97 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -1066,13 +1066,6 @@ static int bond_option_arp_validate_set(struct bonding *bond,
{
netdev_info(bond->dev, "Setting arp_validate to %s (%llu)\n",
newval->string, newval->value);
-
- if (bond->dev->flags & IFF_UP) {
- if (!newval->value)
- bond->recv_probe = NULL;
- else if (bond->params.arp_interval)
- bond->recv_probe = bond_arp_rcv;
- }
bond->params.arp_validate = newval->value;

return 0;
diff --git a/drivers/net/bonding/bond_sysfs_slave.c b/drivers/net/bonding/bond_sysfs_slave.c
index 7d16c51e6913..641a532b67cb 100644
--- a/drivers/net/bonding/bond_sysfs_slave.c
+++ b/drivers/net/bonding/bond_sysfs_slave.c
@@ -55,7 +55,9 @@ static SLAVE_ATTR_RO(link_failure_count);

static ssize_t perm_hwaddr_show(struct slave *slave, char *buf)
{
- return sprintf(buf, "%pM\n", slave->perm_hwaddr);
+ return sprintf(buf, "%*phC\n",
+ slave->dev->addr_len,
+ slave->perm_hwaddr);
}
static SLAVE_ATTR_RO(perm_hwaddr);

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 00bd7be85679..d9ab970dcbe9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4957,8 +4957,15 @@ static int bnxt_cfg_rx_mode(struct bnxt *bp)

skip_uc:
rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, 0);
+ if (rc && vnic->mc_list_count) {
+ netdev_info(bp->dev, "Failed setting MC filters rc: %d, turning on ALL_MCAST mode\n",
+ rc);
+ vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST;
+ vnic->mc_list_count = 0;
+ rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, 0);
+ }
if (rc)
- netdev_err(bp->dev, "HWRM cfa l2 rx mask failure rc: %x\n",
+ netdev_err(bp->dev, "HWRM cfa l2 rx mask failure rc: %d\n",
rc);

return rc;
diff --git a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
index 89714f5e0dfc..c8b9a73d6b1b 100644
--- a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
+++ b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
@@ -253,14 +253,12 @@ uec_set_ringparam(struct net_device *netdev,
return -EINVAL;
}

+ if (netif_running(netdev))
+ return -EBUSY;
+
ug_info->bdRingLenRx[queue] = ring->rx_pending;
ug_info->bdRingLenTx[queue] = ring->tx_pending;

- if (netif_running(netdev)) {
- /* FIXME: restart automatically */
- netdev_info(netdev, "Please re-open the interface\n");
- }
-
return ret;
}

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c b/drivers/net/ethernet/hisilicon/hns/hnae.c
index b3645297477e..3ce41efe8a94 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.c
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -144,7 +144,6 @@ out_buffer_fail:
/* free desc along with its attached buffer */
static void hnae_free_desc(struct hnae_ring *ring)
{
- hnae_free_buffers(ring);
dma_unmap_single(ring_to_dev(ring), ring->desc_dma_addr,
ring->desc_num * sizeof(ring->desc[0]),
ring_to_dma_dir(ring));
@@ -177,6 +176,9 @@ static int hnae_alloc_desc(struct hnae_ring *ring)
/* fini ring, also free the buffer for the ring */
static void hnae_fini_ring(struct hnae_ring *ring)
{
+ if (is_rx_ring(ring))
+ hnae_free_buffers(ring);
+
hnae_free_desc(ring);
kfree(ring->desc_cb);
ring->desc_cb = NULL;
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 2fa54b0b0679..6d649e7b45a9 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -28,9 +28,6 @@

#define SERVICE_TIMER_HZ (1 * HZ)

-#define NIC_TX_CLEAN_MAX_NUM 256
-#define NIC_RX_CLEAN_MAX_NUM 64
-
#define RCB_IRQ_NOT_INITED 0
#define RCB_IRQ_INITED 1

@@ -1408,7 +1405,7 @@ static int hns_nic_init_ring_data(struct hns_nic_priv *priv)
rd->fini_process = hns_nic_tx_fini_pro;

netif_napi_add(priv->netdev, &rd->napi,
- hns_nic_common_poll, NIC_TX_CLEAN_MAX_NUM);
+ hns_nic_common_poll, NAPI_POLL_WEIGHT);
rd->ring->irq_init_flag = RCB_IRQ_NOT_INITED;
}
for (i = h->q_num; i < h->q_num * 2; i++) {
@@ -1420,7 +1417,7 @@ static int hns_nic_init_ring_data(struct hns_nic_priv *priv)
rd->fini_process = hns_nic_rx_fini_pro;

netif_napi_add(priv->netdev, &rd->napi,
- hns_nic_common_poll, NIC_RX_CLEAN_MAX_NUM);
+ hns_nic_common_poll, NAPI_POLL_WEIGHT);
rd->ring->irq_init_flag = RCB_IRQ_NOT_INITED;
}

diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
index 2a0dc127df3f..1a56de06b014 100644
--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
+++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -3183,6 +3183,7 @@ static ssize_t ehea_probe_port(struct device *dev,

if (ehea_add_adapter_mr(adapter)) {
pr_err("creating MR failed\n");
+ of_node_put(eth_dn);
return -EIO;
}

diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index b1915043bc0c..7b9fb71137da 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -193,6 +193,8 @@
/* enable link status from external LINK_0 and LINK_1 pins */
#define E1000_CTRL_SWDPIN0 0x00040000 /* SWDPIN 0 value */
#define E1000_CTRL_SWDPIN1 0x00080000 /* SWDPIN 1 value */
+#define E1000_CTRL_ADVD3WUC 0x00100000 /* D3 WUC */
+#define E1000_CTRL_EN_PHY_PWR_MGMT 0x00200000 /* PHY PM enable */
#define E1000_CTRL_SDP0_DIR 0x00400000 /* SDP0 Data direction */
#define E1000_CTRL_SDP1_DIR 0x00800000 /* SDP1 Data direction */
#define E1000_CTRL_RST 0x04000000 /* Global reset */
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index c1796aa2dde5..70ed5e5c3514 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7325,9 +7325,7 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
struct e1000_hw *hw = &adapter->hw;
u32 ctrl, rctl, status;
u32 wufc = runtime ? E1000_WUFC_LNKC : adapter->wol;
-#ifdef CONFIG_PM
- int retval = 0;
-#endif
+ bool wake;

rtnl_lock();
netif_device_detach(netdev);
@@ -7338,14 +7336,6 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
igb_clear_interrupt_scheme(adapter);
rtnl_unlock();

-#ifdef CONFIG_PM
- if (!runtime) {
- retval = pci_save_state(pdev);
- if (retval)
- return retval;
- }
-#endif
-
status = rd32(E1000_STATUS);
if (status & E1000_STATUS_LU)
wufc &= ~E1000_WUFC_LNKC;
@@ -7362,10 +7352,6 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
}

ctrl = rd32(E1000_CTRL);
- /* advertise wake from D3Cold */
- #define E1000_CTRL_ADVD3WUC 0x00100000
- /* phy power management enable */
- #define E1000_CTRL_EN_PHY_PWR_MGMT 0x00200000
ctrl |= E1000_CTRL_ADVD3WUC;
wr32(E1000_CTRL, ctrl);

@@ -7379,12 +7365,15 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
wr32(E1000_WUFC, 0);
}

- *enable_wake = wufc || adapter->en_mng_pt;
- if (!*enable_wake)
+ wake = wufc || adapter->en_mng_pt;
+ if (!wake)
igb_power_down_link(adapter);
else
igb_power_up_link(adapter);

+ if (enable_wake)
+ *enable_wake = wake;
+
/* Release control of h/w to f/w. If f/w is AMT enabled, this
* would have already happened in close and is redundant.
*/
@@ -7399,22 +7388,7 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
#ifdef CONFIG_PM_SLEEP
static int igb_suspend(struct device *dev)
{
- int retval;
- bool wake;
- struct pci_dev *pdev = to_pci_dev(dev);
-
- retval = __igb_shutdown(pdev, &wake, 0);
- if (retval)
- return retval;
-
- if (wake) {
- pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- pci_set_power_state(pdev, PCI_D3hot);
- }
-
- return 0;
+ return __igb_shutdown(to_pci_dev(dev), NULL, 0);
}
#endif /* CONFIG_PM_SLEEP */

@@ -7483,22 +7457,7 @@ static int igb_runtime_idle(struct device *dev)

static int igb_runtime_suspend(struct device *dev)
{
- struct pci_dev *pdev = to_pci_dev(dev);
- int retval;
- bool wake;
-
- retval = __igb_shutdown(pdev, &wake, 1);
- if (retval)
- return retval;
-
- if (wake) {
- pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- pci_set_power_state(pdev, PCI_D3hot);
- }
-
- return 0;
+ return __igb_shutdown(to_pci_dev(dev), NULL, 1);
}

static int igb_runtime_resume(struct device *dev)
diff --git a/drivers/net/ethernet/micrel/ks8851.c b/drivers/net/ethernet/micrel/ks8851.c
index 1edc973df4c4..7377dca6eb57 100644
--- a/drivers/net/ethernet/micrel/ks8851.c
+++ b/drivers/net/ethernet/micrel/ks8851.c
@@ -547,9 +547,8 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
/* set dma read address */
ks8851_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI | 0x00);

- /* start the packet dma process, and set auto-dequeue rx */
- ks8851_wrreg16(ks, KS_RXQCR,
- ks->rc_rxqcr | RXQCR_SDA | RXQCR_ADRFE);
+ /* start DMA access */
+ ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_SDA);

if (rxlen > 4) {
unsigned int rxalign;
@@ -580,7 +579,8 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
}
}

- ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr);
+ /* end DMA access and dequeue packet */
+ ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_RRXEF);
}
}

@@ -797,6 +797,15 @@ static void ks8851_tx_work(struct work_struct *work)
static int ks8851_net_open(struct net_device *dev)
{
struct ks8851_net *ks = netdev_priv(dev);
+ int ret;
+
+ ret = request_threaded_irq(dev->irq, NULL, ks8851_irq,
+ IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+ dev->name, ks);
+ if (ret < 0) {
+ netdev_err(dev, "failed to get irq\n");
+ return ret;
+ }

/* lock the card, even if we may not actually be doing anything
* else at the moment */
@@ -861,6 +870,7 @@ static int ks8851_net_open(struct net_device *dev)
netif_dbg(ks, ifup, ks->netdev, "network device up\n");

mutex_unlock(&ks->lock);
+ mii_check_link(&ks->mii);
return 0;
}

@@ -911,6 +921,8 @@ static int ks8851_net_stop(struct net_device *dev)
dev_kfree_skb(txb);
}

+ free_irq(dev->irq, ks);
+
return 0;
}

@@ -1516,6 +1528,7 @@ static int ks8851_probe(struct spi_device *spi)

spi_set_drvdata(spi, ks);

+ netif_carrier_off(ks->netdev);
ndev->if_port = IF_PORT_100BASET;
ndev->netdev_ops = &ks8851_netdev_ops;
ndev->irq = spi->irq;
@@ -1542,14 +1555,6 @@ static int ks8851_probe(struct spi_device *spi)
ks8851_read_selftest(ks);
ks8851_init_mac(ks);

- ret = request_threaded_irq(spi->irq, NULL, ks8851_irq,
- IRQF_TRIGGER_LOW | IRQF_ONESHOT,
- ndev->name, ks);
- if (ret < 0) {
- dev_err(&spi->dev, "failed to get irq\n");
- goto err_irq;
- }
-
ret = register_netdev(ndev);
if (ret) {
dev_err(&spi->dev, "failed to register network device\n");
@@ -1562,14 +1567,10 @@ static int ks8851_probe(struct spi_device *spi)

return 0;

-
err_netdev:
- free_irq(ndev->irq, ks);
-
-err_irq:
+err_id:
if (gpio_is_valid(gpio))
gpio_set_value(gpio, 0);
-err_id:
regulator_disable(ks->vdd_reg);
err_reg:
regulator_disable(ks->vdd_io);
@@ -1587,7 +1588,6 @@ static int ks8851_remove(struct spi_device *spi)
dev_info(&spi->dev, "remove\n");

unregister_netdev(priv->netdev);
- free_irq(spi->irq, priv);
if (gpio_is_valid(priv->gpio))
gpio_set_value(priv->gpio, 0);
regulator_disable(priv->vdd_reg);
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
index 0a2318cad34d..63ebc491057b 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
@@ -1038,6 +1038,8 @@ int qlcnic_do_lb_test(struct qlcnic_adapter *adapter, u8 mode)

for (i = 0; i < QLCNIC_NUM_ILB_PKT; i++) {
skb = netdev_alloc_skb(adapter->netdev, QLCNIC_ILB_PKT_SIZE);
+ if (!skb)
+ break;
qlcnic_create_loopback_buff(skb->data, adapter->mac_addr);
skb_put(skb, QLCNIC_ILB_PKT_SIZE);
adapter->ahw->diag_cnt = 0;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 059113dce6e0..f4d6512f066c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1792,8 +1792,6 @@ static int stmmac_open(struct net_device *dev)
struct stmmac_priv *priv = netdev_priv(dev);
int ret;

- stmmac_check_ether_addr(priv);
-
if (priv->pcs != STMMAC_PCS_RGMII && priv->pcs != STMMAC_PCS_TBI &&
priv->pcs != STMMAC_PCS_RTBI) {
ret = stmmac_init_phy(dev);
@@ -2929,6 +2927,8 @@ int stmmac_dvr_probe(struct device *device,
if (ret)
goto error_hw_init;

+ stmmac_check_ether_addr(priv);
+
ndev->netdev_ops = &stmmac_netdev_ops;

ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
index 4e70e7586a09..a5732edc8437 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -3122,12 +3122,16 @@ static int gbe_probe(struct netcp_device *netcp_device, struct device *dev,

ret = netcp_txpipe_init(&gbe_dev->tx_pipe, netcp_device,
gbe_dev->dma_chan_name, gbe_dev->tx_queue_id);
- if (ret)
+ if (ret) {
+ of_node_put(interfaces);
return ret;
+ }

ret = netcp_txpipe_open(&gbe_dev->tx_pipe);
- if (ret)
+ if (ret) {
+ of_node_put(interfaces);
return ret;
+ }

/* Create network interfaces */
INIT_LIST_HEAD(&gbe_dev->gbe_intf_head);
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 4684644703cc..58ba579793f8 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1595,12 +1595,14 @@ static int axienet_probe(struct platform_device *pdev)
ret = of_address_to_resource(np, 0, &dmares);
if (ret) {
dev_err(&pdev->dev, "unable to get DMA resource\n");
+ of_node_put(np);
goto free_netdev;
}
lp->dma_regs = devm_ioremap_resource(&pdev->dev, &dmares);
if (IS_ERR(lp->dma_regs)) {
dev_err(&pdev->dev, "could not map DMA regs\n");
ret = PTR_ERR(lp->dma_regs);
+ of_node_put(np);
goto free_netdev;
}
lp->rx_irq = irq_of_parse_and_map(np, 1);
diff --git a/drivers/net/slip/slhc.c b/drivers/net/slip/slhc.c
index cfd81eb1b532..ddceed3c5a4a 100644
--- a/drivers/net/slip/slhc.c
+++ b/drivers/net/slip/slhc.c
@@ -153,7 +153,7 @@ out_fail:
void
slhc_free(struct slcompress *comp)
{
- if ( comp == NULLSLCOMPR )
+ if ( IS_ERR_OR_NULL(comp) )
return;

if ( comp->tstate != NULLSLSTATE )
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 267a90423154..7b3ef6dc45a4 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1136,6 +1136,12 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
return -EINVAL;
}

+ if (netdev_has_upper_dev(dev, port_dev)) {
+ netdev_err(dev, "Device %s is already an upper device of the team interface\n",
+ portname);
+ return -EBUSY;
+ }
+
if (port_dev->features & NETIF_F_VLAN_CHALLENGED &&
vlan_uses_dev(dev)) {
netdev_err(dev, "Device %s is VLAN challenged and team device has VLAN set up\n",
diff --git a/drivers/net/usb/ipheth.c b/drivers/net/usb/ipheth.c
index f1f8227e7342..01f95d192d25 100644
--- a/drivers/net/usb/ipheth.c
+++ b/drivers/net/usb/ipheth.c
@@ -148,6 +148,7 @@ struct ipheth_device {
u8 bulk_in;
u8 bulk_out;
struct delayed_work carrier_work;
+ bool confirmed_pairing;
};

static int ipheth_rx_submit(struct ipheth_device *dev, gfp_t mem_flags);
@@ -259,7 +260,7 @@ static void ipheth_rcvbulk_callback(struct urb *urb)

dev->net->stats.rx_packets++;
dev->net->stats.rx_bytes += len;
-
+ dev->confirmed_pairing = true;
netif_rx(skb);
ipheth_rx_submit(dev, GFP_ATOMIC);
}
@@ -280,14 +281,24 @@ static void ipheth_sndbulk_callback(struct urb *urb)
dev_err(&dev->intf->dev, "%s: urb status: %d\n",
__func__, status);

- netif_wake_queue(dev->net);
+ if (status == 0)
+ netif_wake_queue(dev->net);
+ else
+ // on URB error, trigger immediate poll
+ schedule_delayed_work(&dev->carrier_work, 0);
}

static int ipheth_carrier_set(struct ipheth_device *dev)
{
- struct usb_device *udev = dev->udev;
+ struct usb_device *udev;
int retval;

+ if (!dev)
+ return 0;
+ if (!dev->confirmed_pairing)
+ return 0;
+
+ udev = dev->udev;
retval = usb_control_msg(udev,
usb_rcvctrlpipe(udev, IPHETH_CTRL_ENDP),
IPHETH_CMD_CARRIER_CHECK, /* request */
@@ -302,11 +313,14 @@ static int ipheth_carrier_set(struct ipheth_device *dev)
return retval;
}

- if (dev->ctrl_buf[0] == IPHETH_CARRIER_ON)
+ if (dev->ctrl_buf[0] == IPHETH_CARRIER_ON) {
netif_carrier_on(dev->net);
- else
+ if (dev->tx_urb->status != -EINPROGRESS)
+ netif_wake_queue(dev->net);
+ } else {
netif_carrier_off(dev->net);
-
+ netif_stop_queue(dev->net);
+ }
return 0;
}

@@ -386,7 +400,6 @@ static int ipheth_open(struct net_device *net)
return retval;

schedule_delayed_work(&dev->carrier_work, IPHETH_CARRIER_CHECK_TIMEOUT);
- netif_start_queue(net);
return retval;
}

@@ -489,7 +502,7 @@ static int ipheth_probe(struct usb_interface *intf,
dev->udev = udev;
dev->net = netdev;
dev->intf = intf;
-
+ dev->confirmed_pairing = false;
/* Set up endpoints */
hintf = usb_altnum_to_altsetting(intf, IPHETH_ALT_INTFNUM);
if (hintf == NULL) {
@@ -540,7 +553,9 @@ static int ipheth_probe(struct usb_interface *intf,
retval = -EIO;
goto err_register_netdev;
}
-
+ // carrier down and transmit queues stopped until packet from device
+ netif_carrier_off(netdev);
+ netif_tx_stop_all_queues(netdev);
dev_info(&intf->dev, "Apple iPhone USB Ethernet device attached\n");
return 0;

diff --git a/drivers/net/wireless/cw1200/scan.c b/drivers/net/wireless/cw1200/scan.c
index 9f1037e7e55c..2ce0193614f2 100644
--- a/drivers/net/wireless/cw1200/scan.c
+++ b/drivers/net/wireless/cw1200/scan.c
@@ -84,8 +84,11 @@ int cw1200_hw_scan(struct ieee80211_hw *hw,

frame.skb = ieee80211_probereq_get(hw, priv->vif->addr, NULL, 0,
req->ie_len);
- if (!frame.skb)
+ if (!frame.skb) {
+ mutex_unlock(&priv->conf_mutex);
+ up(&priv->scan.lock);
return -ENOMEM;
+ }

if (req->ie_len)
memcpy(skb_put(frame.skb, req->ie_len), req->ie, req->ie_len);
diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c
index cb477518dd0e..4c129450495d 100644
--- a/drivers/nvdimm/btt_devs.c
+++ b/drivers/nvdimm/btt_devs.c
@@ -170,14 +170,15 @@ static struct device *__nd_btt_create(struct nd_region *nd_region,
return NULL;

nd_btt->id = ida_simple_get(&nd_region->btt_ida, 0, 0, GFP_KERNEL);
- if (nd_btt->id < 0) {
- kfree(nd_btt);
- return NULL;
- }
+ if (nd_btt->id < 0)
+ goto out_nd_btt;

nd_btt->lbasize = lbasize;
- if (uuid)
+ if (uuid) {
uuid = kmemdup(uuid, 16, GFP_KERNEL);
+ if (!uuid)
+ goto out_put_id;
+ }
nd_btt->uuid = uuid;
dev = &nd_btt->dev;
dev_set_name(dev, "btt%d.%d", nd_region->id, nd_btt->id);
@@ -192,6 +193,13 @@ static struct device *__nd_btt_create(struct nd_region *nd_region,
return NULL;
}
return dev;
+
+out_put_id:
+ ida_simple_remove(&nd_region->btt_ida, nd_btt->id);
+
+out_nd_btt:
+ kfree(nd_btt);
+ return NULL;
}

struct device *nd_btt_create(struct nd_region *nd_region)
diff --git a/drivers/platform/x86/sony-laptop.c b/drivers/platform/x86/sony-laptop.c
index f73c29558cd3..c54ff94c491d 100644
--- a/drivers/platform/x86/sony-laptop.c
+++ b/drivers/platform/x86/sony-laptop.c
@@ -4394,14 +4394,16 @@ sony_pic_read_possible_resource(struct acpi_resource *resource, void *context)
}
return AE_OK;
}
+
+ case ACPI_RESOURCE_TYPE_END_TAG:
+ return AE_OK;
+
default:
dprintk("Resource %d isn't an IRQ nor an IO port\n",
resource->type);
+ return AE_CTRL_TERMINATE;

- case ACPI_RESOURCE_TYPE_END_TAG:
- return AE_OK;
}
- return AE_CTRL_TERMINATE;
}

static int sony_pic_possible_resources(struct acpi_device *device)
diff --git a/drivers/rtc/rtc-da9063.c b/drivers/rtc/rtc-da9063.c
index d6c853bbfa9f..e93beecd5010 100644
--- a/drivers/rtc/rtc-da9063.c
+++ b/drivers/rtc/rtc-da9063.c
@@ -491,6 +491,13 @@ static int da9063_rtc_probe(struct platform_device *pdev)
da9063_data_to_tm(data, &rtc->alarm_time, rtc);
rtc->rtc_sync = false;

+ /*
+ * TODO: some models have alarms on a minute boundary but still support
+ * real hardware interrupts. Add this once the core supports it.
+ */
+ if (config->rtc_data_start != RTC_SEC)
+ rtc->rtc_dev->uie_unsupported = 1;
+
irq_alarm = platform_get_irq_byname(pdev, "ALARM");
ret = devm_request_threaded_irq(&pdev->dev, irq_alarm, NULL,
da9063_alarm_event,
diff --git a/drivers/rtc/rtc-sh.c b/drivers/rtc/rtc-sh.c
index 2b81dd4baf17..104c854d6a8a 100644
--- a/drivers/rtc/rtc-sh.c
+++ b/drivers/rtc/rtc-sh.c
@@ -455,7 +455,7 @@ static int sh_rtc_set_time(struct device *dev, struct rtc_time *tm)
static inline int sh_rtc_read_alarm_value(struct sh_rtc *rtc, int reg_off)
{
unsigned int byte;
- int value = 0xff; /* return 0xff for ignored values */
+ int value = -1; /* return -1 for ignored values */

byte = readb(rtc->regbase + reg_off);
if (byte & AR_ENB) {
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 80a43074c2f9..c530610f61ac 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -2066,14 +2066,14 @@ static int dasd_eckd_end_analysis(struct dasd_block *block)
blk_per_trk = recs_per_track(&private->rdc_data, 0, block->bp_block);

raw:
- block->blocks = (private->real_cyl *
+ block->blocks = ((unsigned long) private->real_cyl *
private->rdc_data.trk_per_cyl *
blk_per_trk);

dev_info(&device->cdev->dev,
- "DASD with %d KB/block, %d KB total size, %d KB/track, "
+ "DASD with %u KB/block, %lu KB total size, %u KB/track, "
"%s\n", (block->bp_block >> 10),
- ((private->real_cyl *
+ (((unsigned long) private->real_cyl *
private->rdc_data.trk_per_cyl *
blk_per_trk * (block->bp_block >> 9)) >> 1),
((blk_per_trk * block->bp_block) >> 10),
diff --git a/drivers/s390/char/con3270.c b/drivers/s390/char/con3270.c
index bae98521c808..3e5a7912044f 100644
--- a/drivers/s390/char/con3270.c
+++ b/drivers/s390/char/con3270.c
@@ -627,7 +627,7 @@ con3270_init(void)
(void (*)(unsigned long)) con3270_read_tasklet,
(unsigned long) condev->read);

- raw3270_add_view(&condev->view, &con3270_fn, 1);
+ raw3270_add_view(&condev->view, &con3270_fn, 1, RAW3270_VIEW_LOCK_IRQ);

INIT_LIST_HEAD(&condev->freemem);
for (i = 0; i < CON3270_STRING_PAGES; i++) {
diff --git a/drivers/s390/char/fs3270.c b/drivers/s390/char/fs3270.c
index 71e974738014..f0c86bcbe316 100644
--- a/drivers/s390/char/fs3270.c
+++ b/drivers/s390/char/fs3270.c
@@ -463,7 +463,8 @@ fs3270_open(struct inode *inode, struct file *filp)

init_waitqueue_head(&fp->wait);
fp->fs_pid = get_pid(task_pid(current));
- rc = raw3270_add_view(&fp->view, &fs3270_fn, minor);
+ rc = raw3270_add_view(&fp->view, &fs3270_fn, minor,
+ RAW3270_VIEW_LOCK_BH);
if (rc) {
fs3270_free_view(&fp->view);
goto out;
diff --git a/drivers/s390/char/raw3270.c b/drivers/s390/char/raw3270.c
index 220acb4cbee5..9c350e6d75bf 100644
--- a/drivers/s390/char/raw3270.c
+++ b/drivers/s390/char/raw3270.c
@@ -956,7 +956,7 @@ raw3270_deactivate_view(struct raw3270_view *view)
* Add view to device with minor "minor".
*/
int
-raw3270_add_view(struct raw3270_view *view, struct raw3270_fn *fn, int minor)
+raw3270_add_view(struct raw3270_view *view, struct raw3270_fn *fn, int minor, int subclass)
{
unsigned long flags;
struct raw3270 *rp;
@@ -978,6 +978,7 @@ raw3270_add_view(struct raw3270_view *view, struct raw3270_fn *fn, int minor)
view->cols = rp->cols;
view->ascebc = rp->ascebc;
spin_lock_init(&view->lock);
+ lockdep_set_subclass(&view->lock, subclass);
list_add(&view->list, &rp->view_list);
rc = 0;
spin_unlock_irqrestore(get_ccwdev_lock(rp->cdev), flags);
diff --git a/drivers/s390/char/raw3270.h b/drivers/s390/char/raw3270.h
index e1e41c2861fb..5ae54317857a 100644
--- a/drivers/s390/char/raw3270.h
+++ b/drivers/s390/char/raw3270.h
@@ -155,6 +155,8 @@ struct raw3270_fn {
struct raw3270_view {
struct list_head list;
spinlock_t lock;
+#define RAW3270_VIEW_LOCK_IRQ 0
+#define RAW3270_VIEW_LOCK_BH 1
atomic_t ref_count;
struct raw3270 *dev;
struct raw3270_fn *fn;
@@ -163,7 +165,7 @@ struct raw3270_view {
unsigned char *ascebc; /* ascii -> ebcdic table */
};

-int raw3270_add_view(struct raw3270_view *, struct raw3270_fn *, int);
+int raw3270_add_view(struct raw3270_view *, struct raw3270_fn *, int, int);
int raw3270_activate_view(struct raw3270_view *);
void raw3270_del_view(struct raw3270_view *);
void raw3270_deactivate_view(struct raw3270_view *);
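
The new subclass argument exists for lockdep's benefit: the console view's
lock is taken from IRQ context while the tty and fullscreen views take
theirs from BH context, and giving the two usages distinct subclasses stops
lockdep from folding them into one class and reporting a false inversion.
The general pattern:

	spinlock_t view_lock;
	int subclass = 0;	/* 0 = RAW3270_VIEW_LOCK_IRQ (console),
				   1 = RAW3270_VIEW_LOCK_BH (tty/fullscreen) */

	spin_lock_init(&view_lock);
	lockdep_set_subclass(&view_lock, subclass);
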
diff --git a/drivers/s390/char/tty3270.c b/drivers/s390/char/tty3270.c
index e96fc7fd9498..ab95d24b991b 100644
--- a/drivers/s390/char/tty3270.c
+++ b/drivers/s390/char/tty3270.c
@@ -937,7 +937,8 @@ static int tty3270_install(struct tty_driver *driver, struct tty_struct *tty)
return PTR_ERR(tp);

rc = raw3270_add_view(&tp->view, &tty3270_fn,
- tty->index + RAW3270_FIRSTMINOR);
+ tty->index + RAW3270_FIRSTMINOR,
+ RAW3270_VIEW_LOCK_BH);
if (rc) {
tty3270_free_view(tp);
return rc;
diff --git a/drivers/s390/net/ctcm_main.c b/drivers/s390/net/ctcm_main.c
index 05c37d6d4afe..a31821d94677 100644
--- a/drivers/s390/net/ctcm_main.c
+++ b/drivers/s390/net/ctcm_main.c
@@ -1595,6 +1595,7 @@ static int ctcm_new_device(struct ccwgroup_device *cgdev)
if (priv->channel[direction] == NULL) {
if (direction == CTCM_WRITE)
channel_free(priv->channel[CTCM_READ]);
+ result = -ENODEV;
goto out_dev;
}
priv->channel[direction]->netdev = dev;
diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index 237688af179b..f7630cf581cd 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -238,10 +238,6 @@ static void _zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req, u32 range,
list_for_each_entry(port, &adapter->port_list, list) {
if ((port->d_id & range) == (ntoh24(page->rscn_fid) & range))
zfcp_fc_test_link(port);
- if (!port->d_id)
- zfcp_erp_port_reopen(port,
- ZFCP_STATUS_COMMON_ERP_FAILED,
- "fcrscn1");
}
read_unlock_irqrestore(&adapter->port_list_lock, flags);
}
@@ -249,6 +245,7 @@ static void _zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req, u32 range,
static void zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req)
{
struct fsf_status_read_buffer *status_buffer = (void *)fsf_req->data;
+ struct zfcp_adapter *adapter = fsf_req->adapter;
struct fc_els_rscn *head;
struct fc_els_rscn_page *page;
u16 i;
@@ -261,6 +258,22 @@ static void zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req)
/* see FC-FS */
no_entries = head->rscn_plen / sizeof(struct fc_els_rscn_page);

+ if (no_entries > 1) {
+ /* handle failed ports */
+ unsigned long flags;
+ struct zfcp_port *port;
+
+ read_lock_irqsave(&adapter->port_list_lock, flags);
+ list_for_each_entry(port, &adapter->port_list, list) {
+ if (port->d_id)
+ continue;
+ zfcp_erp_port_reopen(port,
+ ZFCP_STATUS_COMMON_ERP_FAILED,
+ "fcrscn1");
+ }
+ read_unlock_irqrestore(&adapter->port_list_lock, flags);
+ }
+
for (i = 1; i < no_entries; i++) {
/* skip head and start with 1st element */
page++;
diff --git a/drivers/scsi/csiostor/csio_scsi.c b/drivers/scsi/csiostor/csio_scsi.c
index c2a6f9f29427..ddbdaade654d 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -1713,8 +1713,11 @@ csio_scsi_err_handler(struct csio_hw *hw, struct csio_ioreq *req)
}

out:
- if (req->nsge > 0)
+ if (req->nsge > 0) {
scsi_dma_unmap(cmnd);
+ if (req->dcopy && (host_status == DID_OK))
+ host_status = csio_scsi_copy_to_sgl(hw, req);
+ }

cmnd->result = (((host_status) << 16) | scsi_status);
cmnd->scsi_done(cmnd);
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 7be581f7c35d..1a6f65db615e 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -47,17 +47,16 @@ static void smp_task_timedout(unsigned long _task)
unsigned long flags;

spin_lock_irqsave(&task->task_state_lock, flags);
- if (!(task->task_state_flags & SAS_TASK_STATE_DONE))
+ if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) {
task->task_state_flags |= SAS_TASK_STATE_ABORTED;
+ complete(&task->slow_task->completion);
+ }
spin_unlock_irqrestore(&task->task_state_lock, flags);
-
- complete(&task->slow_task->completion);
}

static void smp_task_done(struct sas_task *task)
{
- if (!del_timer(&task->slow_task->timer))
- return;
+ del_timer(&task->slow_task->timer);
complete(&task->slow_task->completion);
}

diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
index ac12ee844bfc..31c29a5d1f38 100644
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -431,7 +431,7 @@ qla2x00_sysfs_write_optrom_ctl(struct file *filp, struct kobject *kobj,
}

ha->optrom_region_start = start;
- ha->optrom_region_size = start + size;
+ ha->optrom_region_size = size;

ha->optrom_state = QLA_SREADING;
ha->optrom_buffer = vmalloc(ha->optrom_region_size);
@@ -504,7 +504,7 @@ qla2x00_sysfs_write_optrom_ctl(struct file *filp, struct kobject *kobj,
}

ha->optrom_region_start = start;
- ha->optrom_region_size = start + size;
+ ha->optrom_region_size = size;

ha->optrom_state = QLA_SWRITING;
ha->optrom_buffer = vmalloc(ha->optrom_region_size);
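
[Editor's note, not part of the patch: optrom_region_size feeds both the vmalloc() above and the later offset checks, so storing the end offset (start + size) instead of the length over-allocated and over-read by "start" bytes whenever start was non-zero. A hypothetical illustration:]

unsigned int start = 0x40000, size = 0x1000;

unsigned int old_region_size = start + size;	/* 0x41000: end offset, wrong */
unsigned int new_region_size = size;		/* 0x01000: the actual length */
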
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index f9f899ec9427..c158967b59d7 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -3207,6 +3207,8 @@ static int qla4xxx_conn_bind(struct iscsi_cls_session *cls_session,
if (iscsi_conn_bind(cls_session, cls_conn, is_leading))
return -EINVAL;
ep = iscsi_lookup_endpoint(transport_fd);
+ if (!ep)
+ return -EINVAL;
conn = cls_conn->dd_data;
qla_conn = conn->dd_data;
qla_conn->qla_ep = ep->dd_data;
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 44b7a69d022a..45cd4cf93af3 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -613,13 +613,22 @@ static void handle_sc_creation(struct vmbus_channel *new_sc)
static void handle_multichannel_storage(struct hv_device *device, int max_chns)
{
struct storvsc_device *stor_device;
- int num_cpus = num_online_cpus();
int num_sc;
struct storvsc_cmd_request *request;
struct vstor_packet *vstor_packet;
int ret, t;

- num_sc = ((max_chns > num_cpus) ? num_cpus : max_chns);
+ /*
+ * If the number of CPUs is artificially restricted, such as
+ * with maxcpus=1 on the kernel boot line, Hyper-V could offer
+ * sub-channels >= the number of CPUs. These sub-channels
+ * should not be created. The primary channel is already created
+ * and assigned to one CPU, so check against # CPUs - 1.
+ */
+ num_sc = min((int)(num_online_cpus() - 1), max_chns);
+ if (!num_sc)
+ return;
+
stor_device = get_out_stor_device(device);
if (!stor_device)
return;
diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c
index 3adc4516918c..8c5cfb9400d0 100644
--- a/drivers/staging/iio/addac/adt7316.c
+++ b/drivers/staging/iio/addac/adt7316.c
@@ -47,6 +47,8 @@
#define ADT7516_MSB_AIN3 0xA
#define ADT7516_MSB_AIN4 0xB
#define ADT7316_DA_DATA_BASE 0x10
+#define ADT7316_DA_10_BIT_LSB_SHIFT 6
+#define ADT7316_DA_12_BIT_LSB_SHIFT 4
#define ADT7316_DA_MSB_DATA_REGS 4
#define ADT7316_LSB_DAC_A 0x10
#define ADT7316_MSB_DAC_A 0x11
@@ -1092,7 +1094,7 @@ static ssize_t adt7316_store_DAC_internal_Vref(struct device *dev,
ldac_config = chip->ldac_config & (~ADT7516_DAC_IN_VREF_MASK);
if (data & 0x1)
ldac_config |= ADT7516_DAC_AB_IN_VREF;
- else if (data & 0x2)
+ if (data & 0x2)
ldac_config |= ADT7516_DAC_CD_IN_VREF;
} else {
ret = kstrtou8(buf, 16, &data);
@@ -1414,7 +1416,7 @@ static IIO_DEVICE_ATTR(ex_analog_temp_offset, S_IRUGO | S_IWUSR,
static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip,
int channel, char *buf)
{
- u16 data;
+ u16 data = 0;
u8 msb, lsb, offset;
int ret;

@@ -1439,7 +1441,11 @@ static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip,
if (ret)
return -EIO;

- data = (msb << offset) + (lsb & ((1 << offset) - 1));
+ if (chip->dac_bits == 12)
+ data = lsb >> ADT7316_DA_12_BIT_LSB_SHIFT;
+ else if (chip->dac_bits == 10)
+ data = lsb >> ADT7316_DA_10_BIT_LSB_SHIFT;
+ data |= msb << offset;

return sprintf(buf, "%d\n", data);
}
@@ -1447,7 +1453,7 @@ static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip,
static ssize_t adt7316_store_DAC(struct adt7316_chip_info *chip,
int channel, const char *buf, size_t len)
{
- u8 msb, lsb, offset;
+ u8 msb, lsb, lsb_reg, offset;
u16 data;
int ret;

@@ -1465,9 +1471,13 @@ static ssize_t adt7316_store_DAC(struct adt7316_chip_info *chip,
return -EINVAL;

if (chip->dac_bits > 8) {
- lsb = data & (1 << offset);
+ lsb = data & ((1 << offset) - 1);
+ if (chip->dac_bits == 12)
+ lsb_reg = lsb << ADT7316_DA_12_BIT_LSB_SHIFT;
+ else
+ lsb_reg = lsb << ADT7316_DA_10_BIT_LSB_SHIFT;
ret = chip->bus.write(chip->bus.client,
- ADT7316_DA_DATA_BASE + channel * 2, lsb);
+ ADT7316_DA_DATA_BASE + channel * 2, lsb_reg);
if (ret)
return -EIO;
}
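
[Editor's note, not part of the patch: the adt7316 hunks fix how a DAC value is split across the MSB/LSB register pair. The low bits live in the upper part of the LSB register, shifted by 4 (12-bit DACs) or 6 (10-bit DACs), and the old store path masked with (1 << offset) instead of ((1 << offset) - 1). A sketch of the 12-bit packing this implies:]

#include <linux/types.h>

#define DA_12_BIT_LSB_SHIFT	4

/* 12-bit value <-> register pair: bits 11..4 in the MSB register,
 * bits 3..0 in the top nibble of the LSB register. */
static void dac12_pack(u16 data, u8 *msb, u8 *lsb_reg)
{
	*msb = data >> 4;
	*lsb_reg = (data & 0xf) << DA_12_BIT_LSB_SHIFT;
}

static u16 dac12_unpack(u8 msb, u8 lsb_reg)
{
	return (msb << 4) | (lsb_reg >> DA_12_BIT_LSB_SHIFT);
}
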
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 17a22073d226..032f3c13b8c4 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -1448,7 +1448,7 @@ static int __init sc16is7xx_init(void)
ret = i2c_add_driver(&sc16is7xx_i2c_uart_driver);
if (ret < 0) {
pr_err("failed to init sc16is7xx i2c --> %d\n", ret);
- return ret;
+ goto err_i2c;
}
#endif

@@ -1456,10 +1456,18 @@ static int __init sc16is7xx_init(void)
ret = spi_register_driver(&sc16is7xx_spi_uart_driver);
if (ret < 0) {
pr_err("failed to init sc16is7xx spi --> %d\n", ret);
- return ret;
+ goto err_spi;
}
#endif
return ret;
+
+err_spi:
+#ifdef CONFIG_SERIAL_SC16IS7XX_I2C
+ i2c_del_driver(&sc16is7xx_i2c_uart_driver);
+#endif
+err_i2c:
+ uart_unregister_driver(&sc16is7xx_uart);
+ return ret;
}
module_init(sc16is7xx_init);
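
[Editor's note, not part of the patch: the sc16is7xx fix is the standard goto-unwind shape for multi-step module init, where each failure path unregisters exactly what earlier steps registered, in reverse order. A generic sketch with hypothetical register/unregister pairs:]

static int __init mydrv_init(void)
{
	int ret;

	ret = register_a();
	if (ret)
		return ret;		/* nothing to unwind yet */

	ret = register_b();
	if (ret)
		goto err_a;

	ret = register_c();
	if (ret)
		goto err_b;

	return 0;

err_b:
	unregister_b();			/* unwind in reverse order */
err_a:
	unregister_a();
	return ret;
}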

diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index e9d6cf146fcc..654199c6a36c 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -470,11 +470,6 @@ static int usb_unbind_interface(struct device *dev)
pm_runtime_disable(dev);
pm_runtime_set_suspended(dev);

- /* Undo any residual pm_autopm_get_interface_* calls */
- for (r = atomic_read(&intf->pm_usage_cnt); r > 0; --r)
- usb_autopm_put_interface_no_suspend(intf);
- atomic_set(&intf->pm_usage_cnt, 0);
-
if (!error)
usb_autosuspend_device(udev);

@@ -1625,7 +1620,6 @@ void usb_autopm_put_interface(struct usb_interface *intf)
int status;

usb_mark_last_busy(udev);
- atomic_dec(&intf->pm_usage_cnt);
status = pm_runtime_put_sync(&intf->dev);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
@@ -1654,7 +1648,6 @@ void usb_autopm_put_interface_async(struct usb_interface *intf)
int status;

usb_mark_last_busy(udev);
- atomic_dec(&intf->pm_usage_cnt);
status = pm_runtime_put(&intf->dev);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
@@ -1676,7 +1669,6 @@ void usb_autopm_put_interface_no_suspend(struct usb_interface *intf)
struct usb_device *udev = interface_to_usbdev(intf);

usb_mark_last_busy(udev);
- atomic_dec(&intf->pm_usage_cnt);
pm_runtime_put_noidle(&intf->dev);
}
EXPORT_SYMBOL_GPL(usb_autopm_put_interface_no_suspend);
@@ -1707,8 +1699,6 @@ int usb_autopm_get_interface(struct usb_interface *intf)
status = pm_runtime_get_sync(&intf->dev);
if (status < 0)
pm_runtime_put_sync(&intf->dev);
- else
- atomic_inc(&intf->pm_usage_cnt);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
status);
@@ -1742,8 +1732,6 @@ int usb_autopm_get_interface_async(struct usb_interface *intf)
status = pm_runtime_get(&intf->dev);
if (status < 0 && status != -EINPROGRESS)
pm_runtime_put_noidle(&intf->dev);
- else
- atomic_inc(&intf->pm_usage_cnt);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
status);
@@ -1767,7 +1755,6 @@ void usb_autopm_get_interface_no_resume(struct usb_interface *intf)
struct usb_device *udev = interface_to_usbdev(intf);

usb_mark_last_busy(udev);
- atomic_inc(&intf->pm_usage_cnt);
pm_runtime_get_noresume(&intf->dev);
}
EXPORT_SYMBOL_GPL(usb_autopm_get_interface_no_resume);
@@ -1888,14 +1875,11 @@ int usb_runtime_idle(struct device *dev)
return -EBUSY;
}

-int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
+static int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
{
struct usb_hcd *hcd = bus_to_hcd(udev->bus);
int ret = -EPERM;

- if (enable && !udev->usb2_hw_lpm_allowed)
- return 0;
-
if (hcd->driver->set_usb2_hw_lpm) {
ret = hcd->driver->set_usb2_hw_lpm(hcd, udev, enable);
if (!ret)
@@ -1905,6 +1889,24 @@ int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
return ret;
}

+int usb_enable_usb2_hardware_lpm(struct usb_device *udev)
+{
+ if (!udev->usb2_hw_lpm_capable ||
+ !udev->usb2_hw_lpm_allowed ||
+ udev->usb2_hw_lpm_enabled)
+ return 0;
+
+ return usb_set_usb2_hardware_lpm(udev, 1);
+}
+
+int usb_disable_usb2_hardware_lpm(struct usb_device *udev)
+{
+ if (!udev->usb2_hw_lpm_enabled)
+ return 0;
+
+ return usb_set_usb2_hardware_lpm(udev, 0);
+}
+
#endif /* CONFIG_PM */

struct bus_type usb_bus_type = {
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 3a6978458d95..7c87c0b38bcf 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -3116,8 +3116,7 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
}

/* disable USB2 hardware LPM */
- if (udev->usb2_hw_lpm_enabled == 1)
- usb_set_usb2_hardware_lpm(udev, 0);
+ usb_disable_usb2_hardware_lpm(udev);

if (usb_disable_ltm(udev)) {
dev_err(&udev->dev, "Failed to disable LTM before suspend.\n");
@@ -3163,8 +3162,7 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
usb_enable_ltm(udev);
err_ltm:
/* Try to enable USB2 hardware LPM again */
- if (udev->usb2_hw_lpm_capable == 1)
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);

if (udev->do_remote_wakeup)
(void) usb_disable_remote_wakeup(udev);
@@ -3443,8 +3441,7 @@ int usb_port_resume(struct usb_device *udev, pm_message_t msg)
hub_port_logical_disconnect(hub, port1);
} else {
/* Try to enable USB2 hardware LPM */
- if (udev->usb2_hw_lpm_capable == 1)
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);

/* Try to enable USB3 LTM and LPM */
usb_enable_ltm(udev);
@@ -4270,7 +4267,7 @@ static void hub_set_initial_usb2_lpm_policy(struct usb_device *udev)
if ((udev->bos->ext_cap->bmAttributes & cpu_to_le32(USB_BESL_SUPPORT)) ||
connect_type == USB_PORT_CONNECT_TYPE_HARD_WIRED) {
udev->usb2_hw_lpm_allowed = 1;
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);
}
}

@@ -5415,8 +5412,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
/* Disable USB2 hardware LPM.
* It will be re-enabled by the enumeration process.
*/
- if (udev->usb2_hw_lpm_enabled == 1)
- usb_set_usb2_hardware_lpm(udev, 0);
+ usb_disable_usb2_hardware_lpm(udev);

/* Disable LPM and LTM while we reset the device and reinstall the alt
* settings. Device-initiated LPM settings, and system exit latency
@@ -5526,7 +5522,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev)

done:
/* Now that the alt settings are re-installed, enable LTM and LPM. */
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);
usb_unlocked_enable_lpm(udev);
usb_enable_ltm(udev);
usb_release_bos_descriptor(udev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 08cba309eb78..adc696a76b20 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -820,9 +820,11 @@ int usb_string(struct usb_device *dev, int index, char *buf, size_t size)

if (dev->state == USB_STATE_SUSPENDED)
return -EHOSTUNREACH;
- if (size <= 0 || !buf || !index)
+ if (size <= 0 || !buf)
return -EINVAL;
buf[0] = 0;
+ if (index <= 0 || index >= 256)
+ return -EINVAL;
tbuf = kmalloc(256, GFP_NOIO);
if (!tbuf)
return -ENOMEM;
@@ -1184,8 +1186,7 @@ void usb_disable_device(struct usb_device *dev, int skip_ep0)
dev->actconfig->interface[i] = NULL;
}

- if (dev->usb2_hw_lpm_enabled == 1)
- usb_set_usb2_hardware_lpm(dev, 0);
+ usb_disable_usb2_hardware_lpm(dev);
usb_unlocked_disable_lpm(dev);
usb_disable_ltm(dev);

diff --git a/drivers/usb/core/sysfs.c b/drivers/usb/core/sysfs.c
index 65b6e6b84043..6dc0f4e25cf3 100644
--- a/drivers/usb/core/sysfs.c
+++ b/drivers/usb/core/sysfs.c
@@ -472,7 +472,10 @@ static ssize_t usb2_hardware_lpm_store(struct device *dev,

if (!ret) {
udev->usb2_hw_lpm_allowed = value;
- ret = usb_set_usb2_hardware_lpm(udev, value);
+ if (value)
+ ret = usb_enable_usb2_hardware_lpm(udev);
+ else
+ ret = usb_disable_usb2_hardware_lpm(udev);
}

usb_unlock_device(udev);
diff --git a/drivers/usb/core/usb.h b/drivers/usb/core/usb.h
index 53318126ed91..6b2f11544283 100644
--- a/drivers/usb/core/usb.h
+++ b/drivers/usb/core/usb.h
@@ -84,7 +84,8 @@ extern int usb_remote_wakeup(struct usb_device *dev);
extern int usb_runtime_suspend(struct device *dev);
extern int usb_runtime_resume(struct device *dev);
extern int usb_runtime_idle(struct device *dev);
-extern int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable);
+extern int usb_enable_usb2_hardware_lpm(struct usb_device *udev);
+extern int usb_disable_usb2_hardware_lpm(struct usb_device *udev);

#else

@@ -104,7 +105,12 @@ static inline int usb_autoresume_device(struct usb_device *udev)
return 0;
}

-static inline int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
+static inline int usb_enable_usb2_hardware_lpm(struct usb_device *udev)
+{
+ return 0;
+}
+
+static inline int usb_disable_usb2_hardware_lpm(struct usb_device *udev)
{
return 0;
}
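
[Editor's note, not part of the patch: the net effect of the usbcore series above is that the capable/allowed/enabled tests move out of every call site into one enable/disable pair, and the racy driver-visible pm_usage_cnt bookkeeping is dropped in favour of the runtime-PM usage count. The wrapper shape, sketched with hypothetical fields:]

static int set_hw_lpm(struct my_udev *udev, int enable);	/* low level */

int my_enable_hw_lpm(struct my_udev *udev)
{
	/* All preconditions are checked once, here, not at callers. */
	if (!udev->lpm_capable || !udev->lpm_allowed || udev->lpm_enabled)
		return 0;
	return set_hw_lpm(udev, 1);
}

int my_disable_hw_lpm(struct my_udev *udev)
{
	if (!udev->lpm_enabled)
		return 0;
	return set_hw_lpm(udev, 0);
}
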
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 22b4797383cd..4378e758baef 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -867,7 +867,7 @@ static int dwc3_probe(struct platform_device *pdev)
dwc->regs_size = resource_size(res);

/* default to highest possible threshold */
- lpm_nyet_threshold = 0xff;
+ lpm_nyet_threshold = 0xf;

/* default to -3.5dB de-emphasis */
tx_de_emphasis = 1;
diff --git a/drivers/usb/gadget/udc/net2272.c b/drivers/usb/gadget/udc/net2272.c
index 3b6e34fc032b..553922c3be85 100644
--- a/drivers/usb/gadget/udc/net2272.c
+++ b/drivers/usb/gadget/udc/net2272.c
@@ -962,6 +962,7 @@ net2272_dequeue(struct usb_ep *_ep, struct usb_request *_req)
break;
}
if (&req->req != _req) {
+ ep->stopped = stopped;
spin_unlock_irqrestore(&ep->dev->lock, flags);
return -EINVAL;
}
diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c
index 8efeadf30b4d..3a8d056a5d16 100644
--- a/drivers/usb/gadget/udc/net2280.c
+++ b/drivers/usb/gadget/udc/net2280.c
@@ -870,9 +870,6 @@ static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma)
(void) readl(&ep->dev->pci->pcimstctl);

writel(BIT(DMA_START), &dma->dmastat);
-
- if (!ep->is_in)
- stop_out_naking(ep);
}

static void start_dma(struct net2280_ep *ep, struct net2280_request *req)
@@ -911,6 +908,7 @@ static void start_dma(struct net2280_ep *ep, struct net2280_request *req)
writel(BIT(DMA_START), &dma->dmastat);
return;
}
+ stop_out_naking(ep);
}

tmp = dmactl_default;
@@ -1272,9 +1270,9 @@ static int net2280_dequeue(struct usb_ep *_ep, struct usb_request *_req)
break;
}
if (&req->req != _req) {
+ ep->stopped = stopped;
spin_unlock_irqrestore(&ep->dev->lock, flags);
- dev_err(&ep->dev->pdev->dev, "%s: Request mismatch\n",
- __func__);
+ ep_dbg(ep->dev, "%s: Request mismatch\n", __func__);
return -EINVAL;
}

diff --git a/drivers/usb/host/u132-hcd.c b/drivers/usb/host/u132-hcd.c
index d5434e7a3b2e..86f9944f337d 100644
--- a/drivers/usb/host/u132-hcd.c
+++ b/drivers/usb/host/u132-hcd.c
@@ -3214,6 +3214,9 @@ static int __init u132_hcd_init(void)
printk(KERN_INFO "driver %s\n", hcd_name);
workqueue = create_singlethread_workqueue("u132");
retval = platform_driver_register(&u132_platform_driver);
+ if (retval)
+ destroy_workqueue(workqueue);
+
return retval;
}

diff --git a/drivers/usb/misc/yurex.c b/drivers/usb/misc/yurex.c
index 5594a4a4a83f..a8b6d0036e5d 100644
--- a/drivers/usb/misc/yurex.c
+++ b/drivers/usb/misc/yurex.c
@@ -332,6 +332,7 @@ static void yurex_disconnect(struct usb_interface *interface)
usb_deregister_dev(interface, &yurex_class);

/* prevent more I/O from starting */
+ usb_poison_urb(dev->urb);
mutex_lock(&dev->io_mutex);
dev->interface = NULL;
mutex_unlock(&dev->io_mutex);
diff --git a/drivers/usb/serial/generic.c b/drivers/usb/serial/generic.c
index 54e170dd3dad..faead4f32b1c 100644
--- a/drivers/usb/serial/generic.c
+++ b/drivers/usb/serial/generic.c
@@ -350,39 +350,59 @@ void usb_serial_generic_read_bulk_callback(struct urb *urb)
struct usb_serial_port *port = urb->context;
unsigned char *data = urb->transfer_buffer;
unsigned long flags;
+ bool stopped = false;
+ int status = urb->status;
int i;

for (i = 0; i < ARRAY_SIZE(port->read_urbs); ++i) {
if (urb == port->read_urbs[i])
break;
}
- set_bit(i, &port->read_urbs_free);

dev_dbg(&port->dev, "%s - urb %d, len %d\n", __func__, i,
urb->actual_length);
- switch (urb->status) {
+ switch (status) {
case 0:
+ usb_serial_debug_data(&port->dev, __func__, urb->actual_length,
+ data);
+ port->serial->type->process_read_urb(urb);
break;
case -ENOENT:
case -ECONNRESET:
case -ESHUTDOWN:
dev_dbg(&port->dev, "%s - urb stopped: %d\n",
- __func__, urb->status);
- return;
+ __func__, status);
+ stopped = true;
+ break;
case -EPIPE:
dev_err(&port->dev, "%s - urb stopped: %d\n",
- __func__, urb->status);
- return;
+ __func__, status);
+ stopped = true;
+ break;
default:
dev_dbg(&port->dev, "%s - nonzero urb status: %d\n",
- __func__, urb->status);
- goto resubmit;
+ __func__, status);
+ break;
}

- usb_serial_debug_data(&port->dev, __func__, urb->actual_length, data);
- port->serial->type->process_read_urb(urb);
+ /*
+ * Make sure URB processing is done before marking as free to avoid
+ * racing with unthrottle() on another CPU. Matches the barriers
+ * implied by the test_and_clear_bit() in
+ * usb_serial_generic_submit_read_urb().
+ */
+ smp_mb__before_atomic();
+ set_bit(i, &port->read_urbs_free);
+ /*
+ * Make sure URB is marked as free before checking the throttled flag
+ * to avoid racing with unthrottle() on another CPU. Matches the
+ * smp_mb() in unthrottle().
+ */
+ smp_mb__after_atomic();
+
+ if (stopped)
+ return;

-resubmit:
/* Throttle the device if requested by tty */
spin_lock_irqsave(&port->lock, flags);
port->throttled = port->throttle_req;
@@ -399,6 +419,7 @@ void usb_serial_generic_write_bulk_callback(struct urb *urb)
{
unsigned long flags;
struct usb_serial_port *port = urb->context;
+ int status = urb->status;
int i;

for (i = 0; i < ARRAY_SIZE(port->write_urbs); ++i) {
@@ -410,22 +431,22 @@ void usb_serial_generic_write_bulk_callback(struct urb *urb)
set_bit(i, &port->write_urbs_free);
spin_unlock_irqrestore(&port->lock, flags);

- switch (urb->status) {
+ switch (status) {
case 0:
break;
case -ENOENT:
case -ECONNRESET:
case -ESHUTDOWN:
dev_dbg(&port->dev, "%s - urb stopped: %d\n",
- __func__, urb->status);
+ __func__, status);
return;
case -EPIPE:
dev_err_console(port, "%s - urb stopped: %d\n",
- __func__, urb->status);
+ __func__, status);
return;
default:
dev_err_console(port, "%s - nonzero urb status: %d\n",
- __func__, urb->status);
+ __func__, status);
goto resubmit;
}

@@ -456,6 +477,12 @@ void usb_serial_generic_unthrottle(struct tty_struct *tty)
port->throttled = port->throttle_req = 0;
spin_unlock_irq(&port->lock);

+ /*
+ * Matches the smp_mb__after_atomic() in
+ * usb_serial_generic_read_bulk_callback().
+ */
+ smp_mb();
+
if (was_throttled)
usb_serial_generic_submit_read_urbs(port, GFP_KERNEL);
}
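
[Editor's note, not part of the patch: the ordering the new usb-serial comments describe pairs barriers around the free-bit publication with a full barrier in unthrottle(). Either the completion callback sees the throttled flag, or unthrottle() sees the free bit, so an URB can never be lost with neither side resubmitting it. Distilled under hypothetical names; the real code holds port->lock around the flag:]

/* completion side */
smp_mb__before_atomic();		/* URB processing done ...        */
set_bit(i, &urbs_free);			/* ... before marking it free     */
smp_mb__after_atomic();			/* bit visible before flag check  */
if (READ_ONCE(throttled))
	return;				/* unthrottle() will resubmit     */
resubmit(i);

/* unthrottle side */
WRITE_ONCE(throttled, 0);
smp_mb();				/* pairs with smp_mb__after_atomic() */
if (test_and_clear_bit(i, &urbs_free))
	resubmit(i);
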
diff --git a/drivers/usb/storage/realtek_cr.c b/drivers/usb/storage/realtek_cr.c
index 20433563a601..be432bec0c5b 100644
--- a/drivers/usb/storage/realtek_cr.c
+++ b/drivers/usb/storage/realtek_cr.c
@@ -772,18 +772,16 @@ static void rts51x_suspend_timer_fn(unsigned long data)
break;
case RTS51X_STAT_IDLE:
case RTS51X_STAT_SS:
- usb_stor_dbg(us, "RTS51X_STAT_SS, intf->pm_usage_cnt:%d, power.usage:%d\n",
- atomic_read(&us->pusb_intf->pm_usage_cnt),
+ usb_stor_dbg(us, "RTS51X_STAT_SS, power.usage:%d\n",
atomic_read(&us->pusb_intf->dev.power.usage_count));

- if (atomic_read(&us->pusb_intf->pm_usage_cnt) > 0) {
+ if (atomic_read(&us->pusb_intf->dev.power.usage_count) > 0) {
usb_stor_dbg(us, "Ready to enter SS state\n");
rts51x_set_stat(chip, RTS51X_STAT_SS);
/* ignore mass storage interface's children */
pm_suspend_ignore_children(&us->pusb_intf->dev, true);
usb_autopm_put_interface_async(us->pusb_intf);
- usb_stor_dbg(us, "RTS51X_STAT_SS 01, intf->pm_usage_cnt:%d, power.usage:%d\n",
- atomic_read(&us->pusb_intf->pm_usage_cnt),
+ usb_stor_dbg(us, "RTS51X_STAT_SS 01, power.usage:%d\n",
atomic_read(&us->pusb_intf->dev.power.usage_count));
}
break;
@@ -816,11 +814,10 @@ static void rts51x_invoke_transport(struct scsi_cmnd *srb, struct us_data *us)
int ret;

if (working_scsi(srb)) {
- usb_stor_dbg(us, "working scsi, intf->pm_usage_cnt:%d, power.usage:%d\n",
- atomic_read(&us->pusb_intf->pm_usage_cnt),
+ usb_stor_dbg(us, "working scsi, power.usage:%d\n",
atomic_read(&us->pusb_intf->dev.power.usage_count));

- if (atomic_read(&us->pusb_intf->pm_usage_cnt) <= 0) {
+ if (atomic_read(&us->pusb_intf->dev.power.usage_count) <= 0) {
ret = usb_autopm_get_interface(us->pusb_intf);
usb_stor_dbg(us, "working scsi, ret=%d\n", ret);
}
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 6cac8f26b97a..e657b111b320 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -772,23 +772,33 @@ static int uas_slave_alloc(struct scsi_device *sdev)
{
struct uas_dev_info *devinfo =
(struct uas_dev_info *)sdev->host->hostdata;
+ int maxp;

sdev->hostdata = devinfo;

- /* USB has unusual DMA-alignment requirements: Although the
- * starting address of each scatter-gather element doesn't matter,
- * the length of each element except the last must be divisible
- * by the Bulk maxpacket value. There's currently no way to
- * express this by block-layer constraints, so we'll cop out
- * and simply require addresses to be aligned at 512-byte
- * boundaries. This is okay since most block I/O involves
- * hardware sectors that are multiples of 512 bytes in length,
- * and since host controllers up through USB 2.0 have maxpacket
- * values no larger than 512.
- *
- * But it doesn't suffice for Wireless USB, where Bulk maxpacket
- * values can be as large as 2048. To make that work properly
- * will require changes to the block layer.
+ /*
+ * We have two requirements here. We must satisfy the requirements
+ * of the physical HC and the demands of the protocol, and we
+ * definitely want no additional memory allocation in this path,
+ * which rules out bounce buffers.
+ *
+ * For a transmission on USB to continue, we must never send a
+ * packet that is smaller than maxpacket. Hence the length of each
+ * scatterlist element except the last must be divisible by the
+ * Bulk maxpacket value. If the HC does not ensure that through SG,
+ * the upper layer must do so. We must assume nothing about the
+ * capabilities of the HC, so we use the most pessimistic
+ * requirement.
+ */
+
+ maxp = usb_maxpacket(devinfo->udev, devinfo->data_in_pipe, 0);
+ blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
+
+ /*
+ * The protocol has no requirements on alignment in the strict sense.
+ * Controllers may or may not have alignment restrictions.
+ * As these restrictions are not exported, we use an extremely
+ * conservative guess.
*/
blk_queue_update_dma_alignment(sdev->request_queue, (512 - 1));

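[Editor's note, not part of the patch: the property the uas change relies on is that bulk maxpacket values are powers of two (512 for USB 2.0, 1024 for USB 3.0), so maxp - 1 is a valid mask for blk_queue_virt_boundary() and the block layer will split segments such that all but the last are multiples of maxp. Sketched:]

unsigned int maxp = usb_maxpacket(udev, pipe, 0);	/* e.g. 512 */

/* Power-of-two assumption: maxp - 1 is an all-ones mask, so no SG
 * element other than the last may end off a maxp boundary. */
blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
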
diff --git a/drivers/usb/usbip/stub_rx.c b/drivers/usb/usbip/stub_rx.c
index 56cacb68040c..808e3a317954 100644
--- a/drivers/usb/usbip/stub_rx.c
+++ b/drivers/usb/usbip/stub_rx.c
@@ -380,22 +380,10 @@ static int get_pipe(struct stub_device *sdev, struct usbip_header *pdu)
}

if (usb_endpoint_xfer_isoc(epd)) {
- /* validate packet size and number of packets */
- unsigned int maxp, packets, bytes;
-
-#define USB_EP_MAXP_MULT_SHIFT 11
-#define USB_EP_MAXP_MULT_MASK (3 << USB_EP_MAXP_MULT_SHIFT)
-#define USB_EP_MAXP_MULT(m) \
- (((m) & USB_EP_MAXP_MULT_MASK) >> USB_EP_MAXP_MULT_SHIFT)
-
- maxp = usb_endpoint_maxp(epd);
- maxp *= (USB_EP_MAXP_MULT(
- __le16_to_cpu(epd->wMaxPacketSize)) + 1);
- bytes = pdu->u.cmd_submit.transfer_buffer_length;
- packets = DIV_ROUND_UP(bytes, maxp);
-
+ /* validate number of packets */
if (pdu->u.cmd_submit.number_of_packets < 0 ||
- pdu->u.cmd_submit.number_of_packets > packets) {
+ pdu->u.cmd_submit.number_of_packets >
+ USBIP_MAX_ISO_PACKETS) {
dev_err(&sdev->udev->dev,
"CMD_SUBMIT: isoc invalid num packets %d\n",
pdu->u.cmd_submit.number_of_packets);
diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index 0fc5ace57c0e..af903aa4ad90 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -134,6 +134,13 @@ extern struct device_attribute dev_attr_usbip_debug;
#define USBIP_DIR_OUT 0x00
#define USBIP_DIR_IN 0x01

+/*
+ * Arbitrary limit for the maximum number of isochronous packets in an URB,
+ * compare for example the uhci_submit_isochronous function in
+ * drivers/usb/host/uhci-q.c
+ */
+#define USBIP_MAX_ISO_PACKETS 1024
+
/**
* struct usbip_header_basic - data pertinent to every request
* @command: the usbip request type
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index b31b84f56e8f..47b229fa5e8e 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1191,11 +1191,11 @@ static void __init vfio_pci_fill_ids(void)
rc = pci_add_dynid(&vfio_pci_driver, vendor, device,
subvendor, subdevice, class, class_mask, 0);
if (rc)
- pr_warn("failed to add dynamic id [%04hx:%04hx[%04hx:%04hx]] class %#08x/%08x (%d)\n",
+ pr_warn("failed to add dynamic id [%04x:%04x[%04x:%04x]] class %#08x/%08x (%d)\n",
vendor, device, subvendor, subdevice,
class, class_mask, rc);
else
- pr_info("add [%04hx:%04hx[%04hx:%04hx]] class %#08x/%08x\n",
+ pr_info("add [%04x:%04x[%04x:%04x]] class %#08x/%08x\n",
vendor, device, subvendor, subdevice,
class, class_mask);
}
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2fa280671c1e..875634d0d020 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -53,10 +53,16 @@ module_param_named(disable_hugepages,
MODULE_PARM_DESC(disable_hugepages,
"Disable VFIO IOMMU support for IOMMU hugepages.");

+static unsigned int dma_entry_limit __read_mostly = U16_MAX;
+module_param_named(dma_entry_limit, dma_entry_limit, uint, 0644);
+MODULE_PARM_DESC(dma_entry_limit,
+ "Maximum number of user DMA mappings per container (65535).");
+
struct vfio_iommu {
struct list_head domain_list;
struct mutex lock;
struct rb_root dma_list;
+ unsigned int dma_avail;
bool v2;
bool nesting;
};
@@ -382,6 +388,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
vfio_unmap_unpin(iommu, dma);
vfio_unlink_dma(iommu, dma);
kfree(dma);
+ iommu->dma_avail++;
}

static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
@@ -582,12 +589,18 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
return -EEXIST;
}

+ if (!iommu->dma_avail) {
+ mutex_unlock(&iommu->lock);
+ return -ENOSPC;
+ }
+
dma = kzalloc(sizeof(*dma), GFP_KERNEL);
if (!dma) {
mutex_unlock(&iommu->lock);
return -ENOMEM;
}

+ iommu->dma_avail--;
dma->iova = iova;
dma->vaddr = vaddr;
dma->prot = prot;
@@ -903,6 +916,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)

INIT_LIST_HEAD(&iommu->domain_list);
iommu->dma_list = RB_ROOT;
+ iommu->dma_avail = dma_entry_limit;
mutex_init(&iommu->lock);

return iommu;
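
[Editor's note, not part of the patch: the vfio accounting is a simple budget. dma_avail starts at dma_entry_limit, is charged in vfio_dma_do_map() and credited back in vfio_remove_dma(), bounding how much tracking memory a user can pin through mapping entries. Because the limit is a 0644 module parameter, it should also be adjustable at runtime via /sys/module/vfio_iommu_type1/parameters/dma_entry_limit. The pattern in miniature:]

struct my_container {
	unsigned int budget;		/* starts at the module-param limit */
};

static int my_map(struct my_container *c)
{
	if (!c->budget)
		return -ENOSPC;		/* per-container limit reached */
	c->budget--;			/* charge this mapping */
	return 0;
}

static void my_unmap(struct my_container *c)
{
	c->budget++;			/* credit it back */
}
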
diff --git a/drivers/virt/fsl_hypervisor.c b/drivers/virt/fsl_hypervisor.c
index 590a0f51a249..9f96c7e61387 100644
--- a/drivers/virt/fsl_hypervisor.c
+++ b/drivers/virt/fsl_hypervisor.c
@@ -215,6 +215,9 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p)
* hypervisor.
*/
lb_offset = param.local_vaddr & (PAGE_SIZE - 1);
+ if (param.count == 0 ||
+ param.count > U64_MAX - lb_offset - PAGE_SIZE + 1)
+ return -EINVAL;
num_pages = (param.count + lb_offset + PAGE_SIZE - 1) >> PAGE_SHIFT;

/* Allocate the buffers we need */
@@ -335,8 +338,8 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
struct fsl_hv_ioctl_prop param;
char __user *upath, *upropname;
void __user *upropval;
- char *path = NULL, *propname = NULL;
- void *propval = NULL;
+ char *path, *propname;
+ void *propval;
int ret = 0;

/* Get the parameters from the user. */
@@ -348,32 +351,30 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
upropval = (void __user *)(uintptr_t)param.propval;

path = strndup_user(upath, FH_DTPROP_MAX_PATHLEN);
- if (IS_ERR(path)) {
- ret = PTR_ERR(path);
- goto out;
- }
+ if (IS_ERR(path))
+ return PTR_ERR(path);

propname = strndup_user(upropname, FH_DTPROP_MAX_PATHLEN);
if (IS_ERR(propname)) {
ret = PTR_ERR(propname);
- goto out;
+ goto err_free_path;
}

if (param.proplen > FH_DTPROP_MAX_PROPLEN) {
ret = -EINVAL;
- goto out;
+ goto err_free_propname;
}

propval = kmalloc(param.proplen, GFP_KERNEL);
if (!propval) {
ret = -ENOMEM;
- goto out;
+ goto err_free_propname;
}

if (set) {
if (copy_from_user(propval, upropval, param.proplen)) {
ret = -EFAULT;
- goto out;
+ goto err_free_propval;
}

param.ret = fh_partition_set_dtprop(param.handle,
@@ -392,7 +393,7 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
if (copy_to_user(upropval, propval, param.proplen) ||
put_user(param.proplen, &p->proplen)) {
ret = -EFAULT;
- goto out;
+ goto err_free_propval;
}
}
}
@@ -400,10 +401,12 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
if (put_user(param.ret, &p->ret))
ret = -EFAULT;

-out:
- kfree(path);
+err_free_propval:
kfree(propval);
+err_free_propname:
kfree(propname);
+err_free_path:
+ kfree(path);

return ret;
}
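
[Editor's note, not part of the patch: the new bounds check in ioctl_memcpy() falls out of the num_pages computation below it. (count + lb_offset + PAGE_SIZE - 1) >> PAGE_SHIFT must not wrap a u64, which requires count <= U64_MAX - lb_offset - PAGE_SIZE + 1; a zero-length copy is rejected as meaningless. Restated as a standalone check:]

/* Reject zero-length copies and any count whose rounded-up page
 * count would overflow: count + lb_offset + PAGE_SIZE - 1 <= U64_MAX. */
if (count == 0 || count > U64_MAX - lb_offset - PAGE_SIZE + 1)
	return -EINVAL;

num_pages = (count + lb_offset + PAGE_SIZE - 1) >> PAGE_SHIFT;	/* safe */
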
diff --git a/drivers/w1/masters/ds2490.c b/drivers/w1/masters/ds2490.c
index 59d74d1b47a8..2287e1be0e55 100644
--- a/drivers/w1/masters/ds2490.c
+++ b/drivers/w1/masters/ds2490.c
@@ -1039,15 +1039,15 @@ static int ds_probe(struct usb_interface *intf,
/* alternative 3, 1ms interrupt (greatly speeds search), 64 byte bulk */
alt = 3;
err = usb_set_interface(dev->udev,
- intf->altsetting[alt].desc.bInterfaceNumber, alt);
+ intf->cur_altsetting->desc.bInterfaceNumber, alt);
if (err) {
dev_err(&dev->udev->dev, "Failed to set alternative setting %d "
"for %d interface: err=%d.\n", alt,
- intf->altsetting[alt].desc.bInterfaceNumber, err);
+ intf->cur_altsetting->desc.bInterfaceNumber, err);
goto err_out_clear;
}

- iface_desc = &intf->altsetting[alt];
+ iface_desc = intf->cur_altsetting;
if (iface_desc->desc.bNumEndpoints != NUM_EP-1) {
pr_info("Num endpoints=%d. It is not DS9490R.\n",
iface_desc->desc.bNumEndpoints);
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index be7d187d53fd..d636e2660e62 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1288,6 +1288,7 @@ void ceph_dentry_lru_del(struct dentry *dn)
unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn)
{
struct ceph_inode_info *dci = ceph_inode(dir);
+ unsigned hash;

switch (dci->i_dir_layout.dl_dir_hash) {
case 0: /* for backward compat */
@@ -1295,8 +1296,11 @@ unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn)
return dn->d_name.hash;

default:
- return ceph_str_hash(dci->i_dir_layout.dl_dir_hash,
+ spin_lock(&dn->d_lock);
+ hash = ceph_str_hash(dci->i_dir_layout.dl_dir_hash,
dn->d_name.name, dn->d_name.len);
+ spin_unlock(&dn->d_lock);
+ return hash;
}
}

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 9f0d99094cc1..a663b676d566 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -474,6 +474,7 @@ static void ceph_i_callback(struct rcu_head *head)
struct inode *inode = container_of(head, struct inode, i_rcu);
struct ceph_inode_info *ci = ceph_inode(inode);

+ kfree(ci->i_symlink);
kmem_cache_free(ceph_inode_cachep, ci);
}

@@ -505,7 +506,6 @@ void ceph_destroy_inode(struct inode *inode)
ceph_put_snap_realm(mdsc, realm);
}

- kfree(ci->i_symlink);
while ((n = rb_first(&ci->i_fragtree)) != NULL) {
frag = rb_entry(n, struct ceph_inode_frag, node);
rb_erase(n, &ci->i_fragtree);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 35e6e0b2cf34..a5de8e22629b 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1198,6 +1198,15 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
list_add(&ci->i_prealloc_cap_flush->list, &to_remove);
ci->i_prealloc_cap_flush = NULL;
}
+
+ if (drop &&
+ ci->i_wrbuffer_ref_head == 0 &&
+ ci->i_wr_ref == 0 &&
+ ci->i_dirty_caps == 0 &&
+ ci->i_flushing_caps == 0) {
+ ceph_put_snap_context(ci->i_head_snapc);
+ ci->i_head_snapc = NULL;
+ }
}
spin_unlock(&ci->i_ceph_lock);
while (!list_empty(&to_remove)) {
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index a485d0cdc559..3d876a1cf567 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -567,7 +567,12 @@ void ceph_queue_cap_snap(struct ceph_inode_info *ci)
capsnap = NULL;

update_snapc:
- if (ci->i_head_snapc) {
+ if (ci->i_wrbuffer_ref_head == 0 &&
+ ci->i_wr_ref == 0 &&
+ ci->i_dirty_caps == 0 &&
+ ci->i_flushing_caps == 0) {
+ ci->i_head_snapc = NULL;
+ } else {
ci->i_head_snapc = ceph_get_snap_context(new_snapc);
dout(" new snapc is %p\n", new_snapc);
}
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index d8bd8dd36211..0f210cb5038a 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1669,6 +1669,10 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry,
if (rc == 0 || rc != -EBUSY)
goto do_rename_exit;

+ /* Don't fall back to using SMB on SMB 2+ mount */
+ if (server->vals->protocol_id != 0)
+ goto do_rename_exit;
+
/* open-file renames don't work across directories */
if (to_dentry->d_parent != from_dentry->d_parent)
goto do_rename_exit;
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 22fe11baef2b..3530e1c3ff56 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -164,19 +164,24 @@ static int debugfs_show_options(struct seq_file *m, struct dentry *root)
return 0;
}

-static void debugfs_evict_inode(struct inode *inode)
+static void debugfs_i_callback(struct rcu_head *head)
{
- truncate_inode_pages_final(&inode->i_data);
- clear_inode(inode);
+ struct inode *inode = container_of(head, struct inode, i_rcu);
if (S_ISLNK(inode->i_mode))
kfree(inode->i_link);
+ free_inode_nonrcu(inode);
+}
+
+static void debugfs_destroy_inode(struct inode *inode)
+{
+ call_rcu(&inode->i_rcu, debugfs_i_callback);
}

static const struct super_operations debugfs_super_operations = {
.statfs = simple_statfs,
.remount_fs = debugfs_remount,
.show_options = debugfs_show_options,
- .evict_inode = debugfs_evict_inode,
+ .destroy_inode = debugfs_destroy_inode,
};

static struct vfsmount *debugfs_automount(struct path *path)
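
[Editor's note, not part of the patch: the debugfs change moves the i_link free from eager .evict_inode teardown to an RCU-deferred .destroy_inode, because lockless path walk may still dereference i_link under rcu_read_lock() after eviction. The reader/freer pairing, sketched with hypothetical callback names:]

/* reader, e.g. RCU-walk following a symlink */
rcu_read_lock();
link = READ_ONCE(inode->i_link);
if (link)
	follow(link);		/* safe: the free is grace-period deferred */
rcu_read_unlock();

/* freer: wait for all such readers before releasing i_link */
static void my_i_callback(struct rcu_head *head)
{
	struct inode *inode = container_of(head, struct inode, i_rcu);

	if (S_ISLNK(inode->i_mode))
		kfree(inode->i_link);
	free_inode_nonrcu(inode);
}

static void my_destroy_inode(struct inode *inode)
{
	call_rcu(&inode->i_rcu, my_i_callback);
}
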
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index cefae2350da5..27c4e2ac39a9 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -745,11 +745,17 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
umode_t mode, dev_t dev)
{
struct inode *inode;
- struct resv_map *resv_map;
+ struct resv_map *resv_map = NULL;

- resv_map = resv_map_alloc();
- if (!resv_map)
- return NULL;
+ /*
+ * Reserve maps are only needed for inodes that can have associated
+ * page allocations.
+ */
+ if (S_ISREG(mode) || S_ISLNK(mode)) {
+ resv_map = resv_map_alloc();
+ if (!resv_map)
+ return NULL;
+ }

inode = new_inode(sb);
if (inode) {
@@ -790,8 +796,10 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
break;
}
lockdep_annotate_inode_mutex_key(inode);
- } else
- kref_put(&resv_map->refs, resv_map_release);
+ } else {
+ if (resv_map)
+ kref_put(&resv_map->refs, resv_map_release);
+ }

return inode;
}
diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c
index bfebbf13698c..5b52ea41b84f 100644
--- a/fs/jffs2/readinode.c
+++ b/fs/jffs2/readinode.c
@@ -1414,11 +1414,6 @@ void jffs2_do_clear_inode(struct jffs2_sb_info *c, struct jffs2_inode_info *f)

jffs2_kill_fragtree(&f->fragtree, deleted?c:NULL);

- if (f->target) {
- kfree(f->target);
- f->target = NULL;
- }
-
fds = f->dents;
while(fds) {
fd = fds;
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index 023e7f32ee1b..9fc297df8c75 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -47,7 +47,10 @@ static struct inode *jffs2_alloc_inode(struct super_block *sb)
static void jffs2_i_callback(struct rcu_head *head)
{
struct inode *inode = container_of(head, struct inode, i_rcu);
- kmem_cache_free(jffs2_inode_cachep, JFFS2_INODE_INFO(inode));
+ struct jffs2_inode_info *f = JFFS2_INODE_INFO(inode);
+
+ kfree(f->target);
+ kmem_cache_free(jffs2_inode_cachep, f);
}

static void jffs2_destroy_inode(struct inode *inode)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 9b42139a479b..dced329a8584 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2020,7 +2020,8 @@ static int nfs23_validate_mount_data(void *options,
memcpy(sap, &data->addr, sizeof(data->addr));
args->nfs_server.addrlen = sizeof(data->addr);
args->nfs_server.port = ntohs(data->addr.sin_port);
- if (!nfs_verify_server_address(sap))
+ if (sap->sa_family != AF_INET ||
+ !nfs_verify_server_address(sap))
goto out_no_address;

if (!(data->flags & NFS_MOUNT_TCP))
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 24ace275160c..4fa3f0ba9ab3 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -874,8 +874,9 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
cb->cb_seq_status = 1;
cb->cb_status = 0;
if (minorversion) {
- if (!nfsd41_cb_get_slot(clp, task))
+ if (!cb->cb_holds_slot && !nfsd41_cb_get_slot(clp, task))
return;
+ cb->cb_holds_slot = true;
}
rpc_call_start(task);
}
@@ -902,6 +903,9 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
return true;
}

+ if (!cb->cb_holds_slot)
+ goto need_restart;
+
switch (cb->cb_seq_status) {
case 0:
/*
@@ -939,6 +943,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
cb->cb_seq_status);
}

+ cb->cb_holds_slot = false;
clear_bit(0, &clp->cl_cb_slot_busy);
rpc_wake_up_next(&clp->cl_cb_waitq);
dprintk("%s: freed slot, new seqid=%d\n", __func__,
@@ -1146,6 +1151,7 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
cb->cb_seq_status = 1;
cb->cb_status = 0;
cb->cb_need_restart = false;
+ cb->cb_holds_slot = false;
}

void nfsd4_run_cb(struct nfsd4_callback *cb)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 86af697c21d3..2c26bedda7be 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -70,6 +70,7 @@ struct nfsd4_callback {
int cb_seq_status;
int cb_status;
bool cb_need_restart;
+ bool cb_holds_slot;
};

struct nfsd4_callback_ops {
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c7e32a891502..2eea16a81500 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1550,9 +1550,11 @@ static void drop_sysctl_table(struct ctl_table_header *header)
if (--header->nreg)
return;

- if (parent)
+ if (parent) {
put_links(header);
- start_unregistering(header);
+ start_unregistering(header);
+ }
+
if (!--header->count)
kfree_rcu(header, rcu);

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index defeaac0745f..e76d03f44c80 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -1,28 +1,9 @@
#ifndef _LINUX_BITOPS_H
#define _LINUX_BITOPS_H
#include <asm/types.h>
+#include <linux/bits.h>

-#ifdef __KERNEL__
-#define BIT(nr) (1UL << (nr))
-#define BIT_ULL(nr) (1ULL << (nr))
-#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
-#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
-#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG))
-#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG)
-#define BITS_PER_BYTE 8
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
-#endif
-
-/*
- * Create a contiguous bitmask starting at bit position @l and ending at
- * position @h. For example
- * GENMASK_ULL(39, 21) gives us the 64bit vector 0x000000ffffe00000.
- */
-#define GENMASK(h, l) \
- (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
-
-#define GENMASK_ULL(h, l) \
- (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))

extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
diff --git a/include/linux/bits.h b/include/linux/bits.h
new file mode 100644
index 000000000000..2b7b532c1d51
--- /dev/null
+++ b/include/linux/bits.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_BITS_H
+#define __LINUX_BITS_H
+#include <asm/bitsperlong.h>
+
+#define BIT(nr) (1UL << (nr))
+#define BIT_ULL(nr) (1ULL << (nr))
+#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
+#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG))
+#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG)
+#define BITS_PER_BYTE 8
+
+/*
+ * Create a contiguous bitmask starting at bit position @l and ending at
+ * position @h. For example
+ * GENMASK_ULL(39, 21) gives us the 64bit vector 0x000000ffffe00000.
+ */
+#define GENMASK(h, l) \
+ (((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
+
+#define GENMASK_ULL(h, l) \
+ (((~0ULL) - (1ULL << (l)) + 1) & \
+ (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))
+
+#endif /* __LINUX_BITS_H */
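
[Editor's note, not part of the patch: a worked instance of the new GENMASK() arithmetic, for GENMASK(7, 4) on a 64-bit long, makes the two factors easy to check:]

/* low factor:  ~0UL - (1UL << 4) + 1 == 0xfffffffffffffff0  (bits 4..63)
 * high factor: ~0UL >> (64 - 1 - 7)  == 0x00000000000000ff  (bits 0..7)
 * AND of both:                          0x00000000000000f0  (bits 4..7) */
unsigned long mask = GENMASK(7, 4);	/* 0xf0 */
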
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 063c73ed6d78..664f892d6e73 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -50,6 +50,8 @@ extern ssize_t cpu_show_spec_store_bypass(struct device *dev,
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_l1tf(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_mds(struct device *dev,
+ struct device_attribute *attr, char *buf);

extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
@@ -294,4 +296,21 @@ bool cpu_wait_death(unsigned int cpu, int seconds);
bool cpu_report_death(void);
#endif /* #ifdef CONFIG_HOTPLUG_CPU */

+/*
+ * These are used for a global "mitigations=" cmdline option for toggling
+ * optional CPU mitigations.
+ */
+enum cpu_mitigations {
+ CPU_MITIGATIONS_OFF,
+ CPU_MITIGATIONS_AUTO,
+};
+
+extern enum cpu_mitigations cpu_mitigations;
+
+/* mitigations=off */
+static inline bool cpu_mitigations_off(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_OFF;
+}
+
#endif /* _LINUX_CPU_H_ */
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 68904469fba1..2209eb0740b0 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -267,9 +267,15 @@ struct static_key_false {
#define DEFINE_STATIC_KEY_TRUE(name) \
struct static_key_true name = STATIC_KEY_TRUE_INIT

+#define DECLARE_STATIC_KEY_TRUE(name) \
+ extern struct static_key_true name
+
#define DEFINE_STATIC_KEY_FALSE(name) \
struct static_key_false name = STATIC_KEY_FALSE_INIT

+#define DECLARE_STATIC_KEY_FALSE(name) \
+ extern struct static_key_false name
+
extern bool ____wrong_branch_error(void);

#define static_key_enabled(x) \
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 81fdf4b8aba4..8b1e2bd46bb7 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -57,14 +57,17 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
#define PTRACE_MODE_READ 0x01
#define PTRACE_MODE_ATTACH 0x02
#define PTRACE_MODE_NOAUDIT 0x04
-#define PTRACE_MODE_FSCREDS 0x08
-#define PTRACE_MODE_REALCREDS 0x10
+#define PTRACE_MODE_FSCREDS 0x08
+#define PTRACE_MODE_REALCREDS 0x10
+#define PTRACE_MODE_SCHED 0x20
+#define PTRACE_MODE_IBPB 0x40

/* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
#define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
#define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
#define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
#define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
+#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)

/**
* ptrace_may_access - check whether the caller is permitted to access
@@ -82,6 +85,20 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
*/
extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);

+/**
+ * ptrace_may_access_sched - check whether the caller is permitted to access
+ * a target task.
+ * @task: target task
+ * @mode: selects type of access and caller credentials
+ *
+ * Returns true on success, false on denial.
+ *
+ * Similar to ptrace_may_access(). Only to be called from context switch
+ * code. Does not call into audit and the regular LSM hooks due to locking
+ * constraints.
+ */
+extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
+
static inline int ptrace_reparented(struct task_struct *child)
{
return !same_thread_group(child->real_parent, child->parent);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 48a59f731406..a0b540f800d9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2169,6 +2169,8 @@ static inline void memalloc_noio_restore(unsigned int flags)
#define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */
#define PFA_SPEC_SSB_DISABLE 4 /* Speculative Store Bypass disabled */
#define PFA_SPEC_SSB_FORCE_DISABLE 5 /* Speculative Store Bypass force disabled*/
+#define PFA_SPEC_IB_DISABLE 6 /* Indirect branch speculation restricted */
+#define PFA_SPEC_IB_FORCE_DISABLE 7 /* Indirect branch speculation permanently restricted */


#define TASK_PFA_TEST(name, func) \
@@ -2199,6 +2201,13 @@ TASK_PFA_CLEAR(SPEC_SSB_DISABLE, spec_ssb_disable)
TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)
TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)

+TASK_PFA_TEST(SPEC_IB_DISABLE, spec_ib_disable)
+TASK_PFA_SET(SPEC_IB_DISABLE, spec_ib_disable)
+TASK_PFA_CLEAR(SPEC_IB_DISABLE, spec_ib_disable)
+
+TASK_PFA_TEST(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable)
+TASK_PFA_SET(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable)
+
/*
* task->jobctl flags
*/
diff --git a/include/linux/sched/smt.h b/include/linux/sched/smt.h
new file mode 100644
index 000000000000..559ac4590593
--- /dev/null
+++ b/include/linux/sched/smt.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SCHED_SMT_H
+#define _LINUX_SCHED_SMT_H
+
+#include <linux/atomic.h>
+
+#ifdef CONFIG_SCHED_SMT
+extern atomic_t sched_smt_present;
+
+static __always_inline bool sched_smt_active(void)
+{
+ return atomic_read(&sched_smt_present);
+}
+#else
+static inline bool sched_smt_active(void) { return false; }
+#endif
+
+void arch_smt_update(void);
+
+#endif
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 5c03ebc6dfa0..02bffcc611c3 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -127,7 +127,6 @@ enum usb_interface_condition {
* @dev: driver model's view of this device
* @usb_dev: if an interface is bound to the USB major, this will point
* to the sysfs representation for that device.
- * @pm_usage_cnt: PM usage counter for this interface
* @reset_ws: Used for scheduling resets from atomic context.
* @resetting_device: USB core reset the device, so use alt setting 0 as
* current; needs bandwidth alloc after reset.
@@ -184,7 +183,6 @@ struct usb_interface {

struct device dev; /* interface specific device info */
struct device *usb_dev;
- atomic_t pm_usage_cnt; /* usage counter for autosuspend */
struct work_struct reset_ws; /* for resets in atomic context */
};
#define to_usb_interface(d) container_of(d, struct usb_interface, dev)
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 18dd7a3caf2f..af032e5405f6 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -162,6 +162,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex,
const struct in6_addr *addr);
int ipv6_sock_mc_drop(struct sock *sk, int ifindex,
const struct in6_addr *addr);
+void __ipv6_sock_mc_close(struct sock *sk);
void ipv6_sock_mc_close(struct sock *sk);
bool inet6_mc_check(struct sock *sk, const struct in6_addr *mc_addr,
const struct in6_addr *src_addr);
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index 876688b5a356..7c0c83dfe86e 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -174,6 +174,9 @@ struct adv_info {

#define HCI_MAX_SHORT_NAME_LENGTH 10

+/* Min encryption key size to match with SMP */
+#define HCI_MIN_ENC_KEY_SIZE 7
+
/* Default LE RPA expiry time, 15 minutes */
#define HCI_DEFAULT_RPA_TIMEOUT (15 * 60)

diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 64776b72e1eb..64ec0d62e5f5 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -202,6 +202,7 @@ struct prctl_mm_map {
#define PR_SET_SPECULATION_CTRL 53
/* Speculation control variants */
# define PR_SPEC_STORE_BYPASS 0
+# define PR_SPEC_INDIRECT_BRANCH 1
/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
# define PR_SPEC_NOT_AFFECTED 0
# define PR_SPEC_PRCTL (1UL << 0)
diff --git a/init/main.c b/init/main.c
index 49926d95442f..e88c8cdef6a7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -538,6 +538,8 @@ asmlinkage __visible void __init start_kernel(void)
page_alloc_init();

pr_notice("Kernel command line: %s\n", boot_command_line);
+ /* parameters may set static keys */
+ jump_label_init();
parse_early_param();
after_dashes = parse_args("Booting kernel",
static_command_line, __start___param,
@@ -547,8 +549,6 @@ asmlinkage __visible void __init start_kernel(void)
parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
NULL, set_init_arg);

- jump_label_init();
-
/*
* These use large bootmem allocations and must precede
* kmem_cache_init()
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 42ce0b0ae5c5..3225c3a9d028 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -8,6 +8,7 @@
#include <linux/init.h>
#include <linux/notifier.h>
#include <linux/sched.h>
+#include <linux/sched/smt.h>
#include <linux/unistd.h>
#include <linux/cpu.h>
#include <linux/oom.h>
@@ -199,6 +200,12 @@ void cpu_hotplug_enable(void)
EXPORT_SYMBOL_GPL(cpu_hotplug_enable);
#endif /* CONFIG_HOTPLUG_CPU */

+/*
+ * Architectures that need SMT-specific errata handling during SMT hotplug
+ * should override this.
+ */
+void __weak arch_smt_update(void) { }
+
/* Need to know about CPUs going up/down? */
int register_cpu_notifier(struct notifier_block *nb)
{
@@ -434,6 +441,7 @@ out_release:
cpu_hotplug_done();
if (!err)
cpu_notify_nofail(CPU_POST_DEAD | mod, hcpu);
+ arch_smt_update();
return err;
}

@@ -537,7 +545,7 @@ out_notify:
__cpu_notify(CPU_UP_CANCELED | mod, hcpu, nr_calls, NULL);
out:
cpu_hotplug_done();
-
+ arch_smt_update();
return ret;
}

@@ -834,3 +842,16 @@ void init_cpu_online(const struct cpumask *src)
{
cpumask_copy(to_cpumask(cpu_online_bits), src);
}
+
+enum cpu_mitigations cpu_mitigations = CPU_MITIGATIONS_AUTO;
+
+static int __init mitigations_parse_cmdline(char *arg)
+{
+ if (!strcmp(arg, "off"))
+ cpu_mitigations = CPU_MITIGATIONS_OFF;
+ else if (!strcmp(arg, "auto"))
+ cpu_mitigations = CPU_MITIGATIONS_AUTO;
+
+ return 0;
+}
+early_param("mitigations", mitigations_parse_cmdline);
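
[Editor's note, not part of the patch: cpu_mitigations_off() is the hook each architecture's mitigation selection consults; with mitigations=off on the command line it short-circuits to the unmitigated mode. A hypothetical consumer, sketched:]

static enum { MY_MODE_NONE, MY_MODE_FULL } my_mitigation_mode;

static void __init my_mitigation_select(void)
{
	if (cpu_mitigations_off()) {		/* "mitigations=off" */
		my_mitigation_mode = MY_MODE_NONE;
		return;
	}
	my_mitigation_mode = MY_MODE_FULL;	/* CPU_MITIGATIONS_AUTO */
}
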
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 83cea913983c..92c7eb1aeded 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -319,8 +319,10 @@ irq_set_affinity_notifier(unsigned int irq, struct irq_affinity_notify *notify)
desc->affinity_notify = notify;
raw_spin_unlock_irqrestore(&desc->lock, flags);

- if (old_notify)
+ if (old_notify) {
+ cancel_work_sync(&old_notify->work);
kref_put(&old_notify->kref, old_notify->release);
+ }

return 0;
}
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 5e2cd1030702..8303874c2a06 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -228,6 +228,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)

static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
{
+ if (mode & PTRACE_MODE_SCHED)
+ return false;
+
if (mode & PTRACE_MODE_NOAUDIT)
return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
else
@@ -295,9 +298,16 @@ ok:
!ptrace_has_cap(mm->user_ns, mode)))
return -EPERM;

+ if (mode & PTRACE_MODE_SCHED)
+ return 0;
return security_ptrace_access_check(task, mode);
}

+bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
+{
+ return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
+}
+
bool ptrace_may_access(struct task_struct *task, unsigned int mode)
{
int err;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d0618951014b..d35a7d528ea6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5610,6 +5610,10 @@ static void set_cpu_rq_start_time(void)
rq->age_stamp = sched_clock_cpu(cpu);
}

+#ifdef CONFIG_SCHED_SMT
+atomic_t sched_smt_present = ATOMIC_INIT(0);
+#endif
+
static int sched_cpu_active(struct notifier_block *nfb,
unsigned long action, void *hcpu)
{
@@ -5626,11 +5630,23 @@ static int sched_cpu_active(struct notifier_block *nfb,
* set_cpu_online(). But it might not yet have marked itself
* as active, which is essential from here on.
*/
+#ifdef CONFIG_SCHED_SMT
+ /*
+ * When going up, increment the number of cores with SMT present.
+ */
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ atomic_inc(&sched_smt_present);
+#endif
set_cpu_active(cpu, true);
stop_machine_unpark(cpu);
return NOTIFY_OK;

case CPU_DOWN_FAILED:
+#ifdef CONFIG_SCHED_SMT
+ /* Same as for CPU_ONLINE */
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ atomic_inc(&sched_smt_present);
+#endif
set_cpu_active(cpu, true);
return NOTIFY_OK;

@@ -5645,7 +5661,15 @@ static int sched_cpu_inactive(struct notifier_block *nfb,
switch (action & ~CPU_TASKS_FROZEN) {
case CPU_DOWN_PREPARE:
set_cpu_active((long)hcpu, false);
+#ifdef CONFIG_SCHED_SMT
+ /*
+ * When going down, decrement the number of cores with SMT present.
+ */
+ if (cpumask_weight(cpu_smt_mask((long)hcpu)) == 2)
+ atomic_dec(&sched_smt_present);
+#endif
return NOTIFY_OK;
+
default:
return NOTIFY_DONE;
}
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d706cf4fda99..75bfa23f97b4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1722,6 +1722,10 @@ static u64 numa_get_avg_runtime(struct task_struct *p, u64 *period)
if (p->last_task_numa_placement) {
delta = runtime - p->last_sum_exec_runtime;
*period = now - p->last_task_numa_placement;
+
+ /* Avoid time going backwards, prevent potential divide error: */
+ if (unlikely((s64)*period < 0))
+ *period = 0;
} else {
delta = p->se.avg.load_sum / p->se.load.weight;
*period = LOAD_AVG_MAX;
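
[Note: the two timestamps come from clocks that are not guaranteed monotonic relative
to each other; if time appears to run backwards, the unsigned subtraction wraps to a
huge value and later arithmetic can divide by garbage. Casting to signed catches the
wrap, which is all the guard above does:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t now = 1000, last = 2000;   /* clock "went backwards" */
        uint64_t period = now - last;       /* wraps to a huge value */

        if ((int64_t)period < 0)            /* the patch's guard */
            period = 0;

        printf("period=%llu\n", (unsigned long long)period); /* 0 */
        return 0;
    }
]
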
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6893ee31df4d..8b96df04ba78 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2,6 +2,7 @@
#include <linux/sched.h>
#include <linux/sched/sysctl.h>
#include <linux/sched/rt.h>
+#include <linux/sched/smt.h>
#include <linux/sched/deadline.h>
#include <linux/mutex.h>
#include <linux/spinlock.h>
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 1adecb4b87c8..7e4d715f9c22 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -417,7 +417,7 @@ static int __init init_tstats_procfs(void)
{
struct proc_dir_entry *pe;

- pe = proc_create("timer_stats", 0644, NULL, &tstats_fops);
+ pe = proc_create("timer_stats", 0600, NULL, &tstats_fops);
if (!pe)
return -ENOMEM;
return 0;
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 5e091614fe39..1cf2402c6922 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -701,7 +701,7 @@ u64 ring_buffer_time_stamp(struct ring_buffer *buffer, int cpu)

preempt_disable_notrace();
time = rb_time_stamp(buffer);
- preempt_enable_no_resched_notrace();
+ preempt_enable_notrace();

return time;
}
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index ac9791dd4768..5139c4ebb96b 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -363,10 +363,12 @@ static int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
ifrr.ifr_ifru = ifr->ifr_ifru;

switch (cmd) {
+ case SIOCSHWTSTAMP:
+ if (!net_eq(dev_net(dev), &init_net))
+ break;
case SIOCGMIIPHY:
case SIOCGMIIREG:
case SIOCSMIIREG:
- case SIOCSHWTSTAMP:
case SIOCGHWTSTAMP:
if (netif_device_present(real_dev) && ops->ndo_do_ioctl)
err = ops->ndo_do_ioctl(real_dev, &ifrr, cmd);
diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
index 80be0ee17ff3..83d4d574fa44 100644
--- a/net/bluetooth/hci_conn.c
+++ b/net/bluetooth/hci_conn.c
@@ -1177,6 +1177,14 @@ int hci_conn_check_link_mode(struct hci_conn *conn)
!test_bit(HCI_CONN_ENCRYPT, &conn->flags))
return 0;

+ /* The minimum encryption key size needs to be enforced by the
+ * host stack before establishing any L2CAP connections. The
+ * specification in theory allows a minimum of 1, but to align
+ * BR/EDR and LE transports, a minimum of 7 is chosen.
+ */
+ if (conn->enc_key_size < HCI_MIN_ENC_KEY_SIZE)
+ return 0;
+
return 1;
}
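
[Note: without a floor on the negotiated encryption key size, an attacker who can
influence key negotiation may force a key short enough to brute-force while the link
still reports itself as encrypted. The check above refuses to treat such links as
secure. Reduced to its essence (illustrative helper; the constant 7 is the value the
patch chooses):

    #include <stdio.h>

    #define HCI_MIN_ENC_KEY_SIZE 7

    static int link_ok(int encrypted, int enc_key_size)
    {
        if (!encrypted)
            return 0;
        if (enc_key_size < HCI_MIN_ENC_KEY_SIZE)
            return 0;   /* negotiated key too short to trust */
        return 1;
    }

    int main(void)
    {
        printf("%d %d\n", link_ok(1, 16), link_ok(1, 1));   /* 1 0 */
        return 0;
    }
]
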

diff --git a/net/bluetooth/hidp/sock.c b/net/bluetooth/hidp/sock.c
index 008ba439bd62..cc80c76177b6 100644
--- a/net/bluetooth/hidp/sock.c
+++ b/net/bluetooth/hidp/sock.c
@@ -76,6 +76,7 @@ static int hidp_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long
sockfd_put(csock);
return err;
}
+ ca.name[sizeof(ca.name)-1] = 0;

err = hidp_connection_add(&ca, csock, isock);
if (!err && copy_to_user(argp, &ca, sizeof(ca)))
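
[Note: ca is copied verbatim from userspace, so nothing guarantees the embedded name
is NUL-terminated; one forced terminator keeps later string handling from running off
the end of the field (and from leaking adjacent memory back out via copy_to_user).
A standalone demonstration of the pattern (illustrative struct):

    #include <stdio.h>
    #include <string.h>

    struct connadd { char name[8]; };

    int main(void)
    {
        struct connadd ca;

        memset(ca.name, 'A', sizeof(ca.name));  /* no terminator, as from userspace */
        ca.name[sizeof(ca.name) - 1] = 0;       /* the fix */
        printf("%s\n", ca.name);                /* safe: at most 7 'A's */
        return 0;
    }
]
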
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 50e84e634dfe..c7a281549d91 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -471,13 +471,15 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
call_netdevice_notifiers(NETDEV_JOIN, dev);

err = dev_set_allmulti(dev, 1);
- if (err)
- goto put_back;
+ if (err) {
+ kfree(p); /* kobject not yet init'd, manually free */
+ goto err1;
+ }

err = kobject_init_and_add(&p->kobj, &brport_ktype, &(dev->dev.kobj),
SYSFS_BRIDGE_PORT_ATTR);
if (err)
- goto err1;
+ goto err2;

err = br_sysfs_addif(p);
if (err)
@@ -551,12 +553,9 @@ err3:
sysfs_remove_link(br->ifobj, p->dev->name);
err2:
kobject_put(&p->kobj);
- p = NULL; /* kobject_put frees */
-err1:
dev_set_allmulti(dev, -1);
-put_back:
+err1:
dev_put(dev);
- kfree(p);
return err;
}

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 93b5525bcccf..2ae0451fd634 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -507,6 +507,7 @@ static unsigned int br_nf_pre_routing(void *priv,
nf_bridge->ipv4_daddr = ip_hdr(skb)->daddr;

skb->protocol = htons(ETH_P_IP);
+ skb->transport_header = skb->network_header + ip_hdr(skb)->ihl * 4;

NF_HOOK(NFPROTO_IPV4, NF_INET_PRE_ROUTING, state->net, state->sk, skb,
skb->dev, NULL,
diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
index 69dfd212e50d..f94c83f5cc37 100644
--- a/net/bridge/br_netfilter_ipv6.c
+++ b/net/bridge/br_netfilter_ipv6.c
@@ -237,6 +237,8 @@ unsigned int br_nf_pre_routing_ipv6(void *priv,
nf_bridge->ipv6_daddr = ipv6_hdr(skb)->daddr;

skb->protocol = htons(ETH_P_IPV6);
+ skb->transport_header = skb->network_header + sizeof(struct ipv6hdr);
+
NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING, state->net, state->sk, skb,
skb->dev, NULL,
br_nf_pre_routing_finish_ipv6);
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index f13402d407e4..1a87cf78fadc 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -2046,7 +2046,8 @@ static int ebt_size_mwt(struct compat_ebt_entry_mwt *match32,
if (match_kern)
match_kern->match_size = ret;

- if (WARN_ON(type == EBT_COMPAT_TARGET && size_left))
+ /* rule should have no remaining data after target */
+ if (type == EBT_COMPAT_TARGET && size_left)
return -EINVAL;

match32 = (struct compat_ebt_entry_mwt *) buf;
diff --git a/net/core/filter.c b/net/core/filter.c
index 1a9ded6af138..3c5f51198c41 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -742,6 +742,17 @@ static bool chk_code_allowed(u16 code_to_probe)
return codes[code_to_probe];
}

+static bool bpf_check_basics_ok(const struct sock_filter *filter,
+ unsigned int flen)
+{
+ if (filter == NULL)
+ return false;
+ if (flen == 0 || flen > BPF_MAXINSNS)
+ return false;
+
+ return true;
+}
+
/**
* bpf_check_classic - verify socket filter code
* @filter: filter to verify
@@ -762,9 +773,6 @@ static int bpf_check_classic(const struct sock_filter *filter,
bool anc_found;
int pc;

- if (flen == 0 || flen > BPF_MAXINSNS)
- return -EINVAL;
-
/* Check the filter code now */
for (pc = 0; pc < flen; pc++) {
const struct sock_filter *ftest = &filter[pc];
@@ -1057,7 +1065,7 @@ int bpf_prog_create(struct bpf_prog **pfp, struct sock_fprog_kern *fprog)
struct bpf_prog *fp;

/* Make sure new filter is there and in the right amounts. */
- if (fprog->filter == NULL)
+ if (!bpf_check_basics_ok(fprog->filter, fprog->len))
return -EINVAL;

fp = bpf_prog_alloc(bpf_prog_size(fprog->len), 0);
@@ -1104,7 +1112,7 @@ int bpf_prog_create_from_user(struct bpf_prog **pfp, struct sock_fprog *fprog,
int err;

/* Make sure new filter is there and in the right amounts. */
- if (fprog->filter == NULL)
+ if (!bpf_check_basics_ok(fprog->filter, fprog->len))
return -EINVAL;

fp = bpf_prog_alloc(bpf_prog_size(fprog->len), 0);
@@ -1184,7 +1192,6 @@ int __sk_attach_filter(struct sock_fprog *fprog, struct sock *sk,
bool locked)
{
unsigned int fsize = bpf_classic_proglen(fprog);
- unsigned int bpf_fsize = bpf_prog_size(fprog->len);
struct bpf_prog *prog;
int err;

@@ -1192,10 +1199,10 @@ int __sk_attach_filter(struct sock_fprog *fprog, struct sock *sk,
return -EPERM;

/* Make sure new filter is there and in the right amounts. */
- if (fprog->filter == NULL)
+ if (!bpf_check_basics_ok(fprog->filter, fprog->len))
return -EINVAL;

- prog = bpf_prog_alloc(bpf_fsize, 0);
+ prog = bpf_prog_alloc(bpf_prog_size(fprog->len), 0);
if (!prog)
return -ENOMEM;
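
[Note: previously each attach path checked for a NULL filter, but the instruction-count
bounds check lived only in bpf_check_classic(), after __sk_attach_filter() had already
computed an allocation size from fprog->len. Hoisting both checks into one helper makes
every entry point validate the same way, up front. The helper is self-contained enough
to exercise directly (BPF_MAXINSNS is 4096 upstream):

    #include <stdbool.h>
    #include <stdio.h>

    #define BPF_MAXINSNS 4096
    struct sock_filter { unsigned short code; };

    static bool bpf_check_basics_ok(const struct sock_filter *filter,
                                    unsigned int flen)
    {
        if (filter == NULL)
            return false;
        if (flen == 0 || flen > BPF_MAXINSNS)
            return false;
        return true;
    }

    int main(void)
    {
        struct sock_filter insns[2] = { { 0 }, { 0 } };

        printf("%d\n", bpf_check_basics_ok(NULL, 2));    /* 0 */
        printf("%d\n", bpf_check_basics_ok(insns, 0));   /* 0 */
        printf("%d\n", bpf_check_basics_ok(insns, 2));   /* 1 */
        return 0;
    }
]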

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index c11bb6d2d00a..6d5a0a7ebe10 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -475,6 +475,7 @@ static void ip_copy_metadata(struct sk_buff *to, struct sk_buff *from)
to->pkt_type = from->pkt_type;
to->priority = from->priority;
to->protocol = from->protocol;
+ to->skb_iif = from->skb_iif;
skb_dst_drop(to);
skb_dst_copy(to, from);
to->dev = from->dev;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 4d3d4291c82f..e742323d69e1 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -167,6 +167,7 @@ static int icmp_filter(const struct sock *sk, const struct sk_buff *skb)
*/
static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash)
{
+ int dif = inet_iif(skb);
struct sock *sk;
struct hlist_head *head;
int delivered = 0;
@@ -179,8 +180,7 @@ static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash)

net = dev_net(skb->dev);
sk = __raw_v4_lookup(net, __sk_head(head), iph->protocol,
- iph->saddr, iph->daddr,
- skb->dev->ifindex);
+ iph->saddr, iph->daddr, dif);

while (sk) {
delivered = 1;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 1d580d290054..a58effba760a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1162,25 +1162,39 @@ static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie)
return dst;
}

-static void ipv4_link_failure(struct sk_buff *skb)
+static void ipv4_send_dest_unreach(struct sk_buff *skb)
{
struct ip_options opt;
- struct rtable *rt;
int res;

/* Recompile ip options since IPCB may not be valid anymore.
+ * Also check we have a reasonable ipv4 header.
*/
- memset(&opt, 0, sizeof(opt));
- opt.optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr);
+ if (!pskb_network_may_pull(skb, sizeof(struct iphdr)) ||
+ ip_hdr(skb)->version != 4 || ip_hdr(skb)->ihl < 5)
+ return;

- rcu_read_lock();
- res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
- rcu_read_unlock();
+ memset(&opt, 0, sizeof(opt));
+ if (ip_hdr(skb)->ihl > 5) {
+ if (!pskb_network_may_pull(skb, ip_hdr(skb)->ihl * 4))
+ return;
+ opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);

- if (res)
- return;
+ rcu_read_lock();
+ res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
+ rcu_read_unlock();

+ if (res)
+ return;
+ }
__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
+}
+
+static void ipv4_link_failure(struct sk_buff *skb)
+{
+ struct rtable *rt;
+
+ ipv4_send_dest_unreach(skb);

rt = skb_rtable(skb);
if (rt)
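
[Note: by the time ipv4_link_failure() runs, the skb may have been mangled, so the
patch verifies there really is a sane IPv4 header before trusting ihl*4 to locate the
options. The same validation in isolation, on a raw byte buffer (illustrative helper):

    #include <stdio.h>
    #include <stddef.h>

    static int ipv4_header_ok(const unsigned char *pkt, size_t len)
    {
        if (len < 20)                   /* sizeof(struct iphdr) */
            return 0;
        if ((pkt[0] >> 4) != 4)         /* version field */
            return 0;
        if ((pkt[0] & 0x0f) < 5)        /* ihl, in 32-bit words */
            return 0;
        if ((size_t)(pkt[0] & 0x0f) * 4 > len)  /* options must fit */
            return 0;
        return 1;
    }

    int main(void)
    {
        unsigned char v4[20]  = { 0x45 };   /* version 4, ihl 5 */
        unsigned char bad[20] = { 0x60 };   /* looks like IPv6 */

        printf("%d %d\n", ipv4_header_ok(v4, sizeof(v4)),
               ipv4_header_ok(bad, sizeof(bad)));   /* 1 0 */
        return 0;
    }
]
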
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index da90c74d12ef..167ca0fddf9e 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -42,6 +42,7 @@ static int tcp_syn_retries_min = 1;
static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
static int ip_ping_group_range_min[] = { 0, 0 };
static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
+static int one_day_secs = 24 * 3600;

/* Update system visible IP port range */
static void set_local_port_range(struct net *net, int range[2])
@@ -597,7 +598,9 @@ static struct ctl_table ipv4_table[] = {
.data = &sysctl_tcp_min_rtt_wlen,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &one_day_secs
},
{
.procname = "tcp_low_latency",
diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
index f3a0a9c0f61e..c6061f7343f1 100644
--- a/net/ipv6/ip6_flowlabel.c
+++ b/net/ipv6/ip6_flowlabel.c
@@ -94,15 +94,21 @@ static struct ip6_flowlabel *fl_lookup(struct net *net, __be32 label)
return fl;
}

+static void fl_free_rcu(struct rcu_head *head)
+{
+ struct ip6_flowlabel *fl = container_of(head, struct ip6_flowlabel, rcu);
+
+ if (fl->share == IPV6_FL_S_PROCESS)
+ put_pid(fl->owner.pid);
+ kfree(fl->opt);
+ kfree(fl);
+}
+

static void fl_free(struct ip6_flowlabel *fl)
{
- if (fl) {
- if (fl->share == IPV6_FL_S_PROCESS)
- put_pid(fl->owner.pid);
- kfree(fl->opt);
- kfree_rcu(fl, rcu);
- }
+ if (fl)
+ call_rcu(&fl->rcu, fl_free_rcu);
}

static void fl_release(struct ip6_flowlabel *fl)
@@ -633,9 +639,9 @@ recheck:
if (fl1->share == IPV6_FL_S_EXCL ||
fl1->share != fl->share ||
((fl1->share == IPV6_FL_S_PROCESS) &&
- (fl1->owner.pid == fl->owner.pid)) ||
+ (fl1->owner.pid != fl->owner.pid)) ||
((fl1->share == IPV6_FL_S_USER) &&
- uid_eq(fl1->owner.uid, fl->owner.uid)))
+ !uid_eq(fl1->owner.uid, fl->owner.uid)))
goto release;

err = -ENOMEM;
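
[Note: two fixes land here. Freeing moves into an explicit RCU callback so readers
walking the hash under rcu_read_lock() never see the owner pid dropped out from under
them, and the share-ownership comparisons were inverted: the old code rejected reuse
when the owner *matched* rather than when it differed. The corrected predicate,
reduced to plain C (illustrative types; owners stand in for pid/uid):

    #include <stdio.h>

    enum { FL_S_EXCL, FL_S_PROCESS, FL_S_USER };

    static int may_share(int share1, int owner1, int share2, int owner2)
    {
        if (share1 == FL_S_EXCL || share1 != share2)
            return 0;
        if (share1 == FL_S_PROCESS && owner1 != owner2)
            return 0;
        if (share1 == FL_S_USER && owner1 != owner2)
            return 0;
        return 1;
    }

    int main(void)
    {
        printf("%d\n", may_share(FL_S_PROCESS, 100, FL_S_PROCESS, 100)); /* 1 */
        printf("%d\n", may_share(FL_S_PROCESS, 100, FL_S_PROCESS, 200)); /* 0 */
        return 0;
    }
]
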
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 8d11a034ca3f..71263754b19b 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -121,6 +121,7 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
static bool setsockopt_needs_rtnl(int optname)
{
switch (optname) {
+ case IPV6_ADDRFORM:
case IPV6_ADD_MEMBERSHIP:
case IPV6_DROP_MEMBERSHIP:
case IPV6_JOIN_ANYCAST:
@@ -199,7 +200,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
}

fl6_free_socklist(sk);
- ipv6_sock_mc_close(sk);
+ __ipv6_sock_mc_close(sk);

/*
* Sock is moving from IPv6 to IPv4 (sk_prot), so
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index a5ec9a0cbb80..976c8133a281 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -276,16 +276,14 @@ static struct inet6_dev *ip6_mc_find_dev_rcu(struct net *net,
return idev;
}

-void ipv6_sock_mc_close(struct sock *sk)
+void __ipv6_sock_mc_close(struct sock *sk)
{
struct ipv6_pinfo *np = inet6_sk(sk);
struct ipv6_mc_socklist *mc_lst;
struct net *net = sock_net(sk);

- if (!rcu_access_pointer(np->ipv6_mc_list))
- return;
+ ASSERT_RTNL();

- rtnl_lock();
while ((mc_lst = rtnl_dereference(np->ipv6_mc_list)) != NULL) {
struct net_device *dev;

@@ -303,8 +301,17 @@ void ipv6_sock_mc_close(struct sock *sk)

atomic_sub(sizeof(*mc_lst), &sk->sk_omem_alloc);
kfree_rcu(mc_lst, rcu);
-
}
+}
+
+void ipv6_sock_mc_close(struct sock *sk)
+{
+ struct ipv6_pinfo *np = inet6_sk(sk);
+
+ if (!rcu_access_pointer(np->ipv6_mc_list))
+ return;
+ rtnl_lock();
+ __ipv6_sock_mc_close(sk);
rtnl_unlock();
}

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 77736190dc15..5039486c4f86 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1076,7 +1076,7 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
if (!tdev && tunnel->parms.link)
tdev = __dev_get_by_index(tunnel->net, tunnel->parms.link);

- if (tdev) {
+ if (tdev && !netif_is_l3_master(tdev)) {
int t_hlen = tunnel->hlen + sizeof(struct iphdr);

dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index ac212542a217..c4509a10ce52 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1484,7 +1484,7 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
if (!cp) {
int v;

- if (!sysctl_schedule_icmp(ipvs))
+ if (ipip || !sysctl_schedule_icmp(ipvs))
return NF_ACCEPT;

if (!ip_vs_try_to_schedule(ipvs, AF_INET, skb, pd, &v, &cp, &ciph))
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index b6e72af15237..cdafbd38a456 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1699,7 +1699,7 @@ static int __init xt_init(void)
seqcount_init(&per_cpu(xt_recseq, i));
}

- xt = kmalloc(sizeof(struct xt_af) * NFPROTO_NUMPROTO, GFP_KERNEL);
+ xt = kcalloc(NFPROTO_NUMPROTO, sizeof(struct xt_af), GFP_KERNEL);
if (!xt)
return -ENOMEM;
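
[Note: kmalloc(sizeof(x) * n) is an overflow-prone idiom; if the count can wrap, the
multiplication silently yields an undersized buffer. kcalloc() checks the
multiplication and fails instead, exactly like calloc() in userspace. Here the count
is a small constant, so this is defensive idiom cleanup rather than a live bug:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t n = (size_t)-1 / 8 + 2;  /* n * 16 overflows size_t */

        /* malloc(n * 16) would wrap to a tiny allocation; calloc()
         * detects the overflow and returns NULL instead. */
        void *p = calloc(n, 16);

        printf("calloc: %p\n", p);      /* expect (nil) */
        free(p);
        return 0;
    }
]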

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 7d93228ba1e1..c78bcc13ebab 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2490,8 +2490,8 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
void *ph;
DECLARE_SOCKADDR(struct sockaddr_ll *, saddr, msg->msg_name);
bool need_wait = !(msg->msg_flags & MSG_DONTWAIT);
+ unsigned char *addr = NULL;
int tp_len, size_max;
- unsigned char *addr;
int len_sum = 0;
int status = TP_STATUS_AVAILABLE;
int hlen, tlen;
@@ -2511,10 +2511,13 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
sll_addr)))
goto out;
proto = saddr->sll_protocol;
- addr = saddr->sll_halen ? saddr->sll_addr : NULL;
dev = dev_get_by_index(sock_net(&po->sk), saddr->sll_ifindex);
- if (addr && dev && saddr->sll_halen < dev->addr_len)
- goto out_put;
+ if (po->sk.sk_socket->type == SOCK_DGRAM) {
+ if (dev && msg->msg_namelen < dev->addr_len +
+ offsetof(struct sockaddr_ll, sll_addr))
+ goto out_put;
+ addr = saddr->sll_addr;
+ }
}

err = -ENXIO;
@@ -2652,7 +2655,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
struct sk_buff *skb;
struct net_device *dev;
__be16 proto;
- unsigned char *addr;
+ unsigned char *addr = NULL;
int err, reserve = 0;
struct sockcm_cookie sockc;
struct virtio_net_hdr vnet_hdr = { 0 };
@@ -2672,7 +2675,6 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
if (likely(saddr == NULL)) {
dev = packet_cached_dev_get(po);
proto = po->num;
- addr = NULL;
} else {
err = -EINVAL;
if (msg->msg_namelen < sizeof(struct sockaddr_ll))
@@ -2680,10 +2682,13 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
if (msg->msg_namelen < (saddr->sll_halen + offsetof(struct sockaddr_ll, sll_addr)))
goto out;
proto = saddr->sll_protocol;
- addr = saddr->sll_halen ? saddr->sll_addr : NULL;
dev = dev_get_by_index(sock_net(sk), saddr->sll_ifindex);
- if (addr && dev && saddr->sll_halen < dev->addr_len)
- goto out_unlock;
+ if (sock->type == SOCK_DGRAM) {
+ if (dev && msg->msg_namelen < dev->addr_len +
+ offsetof(struct sockaddr_ll, sll_addr))
+ goto out_unlock;
+ addr = saddr->sll_addr;
+ }
}

err = -ENXIO;
@@ -4518,14 +4523,29 @@ static void __exit packet_exit(void)

static int __init packet_init(void)
{
- int rc = proto_register(&packet_proto, 0);
+ int rc;

- if (rc != 0)
+ rc = proto_register(&packet_proto, 0);
+ if (rc)
goto out;
+ rc = sock_register(&packet_family_ops);
+ if (rc)
+ goto out_proto;
+ rc = register_pernet_subsys(&packet_net_ops);
+ if (rc)
+ goto out_sock;
+ rc = register_netdevice_notifier(&packet_netdev_notifier);
+ if (rc)
+ goto out_pernet;

- sock_register(&packet_family_ops);
- register_pernet_subsys(&packet_net_ops);
- register_netdevice_notifier(&packet_netdev_notifier);
+ return 0;
+
+out_pernet:
+ unregister_pernet_subsys(&packet_net_ops);
+out_sock:
+ sock_unregister(PF_PACKET);
+out_proto:
+ proto_unregister(&packet_proto);
out:
return rc;
}
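
[Note: packet_init() used to ignore failures from sock_register() and friends; now
each step checks its result and unwinds everything that already succeeded, in reverse
order. The shape of that goto-based pattern, with mallocs standing in for the four
registrations (illustrative sketch):

    #include <stdio.h>
    #include <stdlib.h>

    static int init_subsys(int fail_at)
    {
        void *a, *b, *c;

        a = fail_at == 1 ? NULL : malloc(16);
        if (!a)
            goto out;
        b = fail_at == 2 ? NULL : malloc(16);
        if (!b)
            goto out_a;
        c = fail_at == 3 ? NULL : malloc(16);
        if (!c)
            goto out_b;

        free(c); free(b); free(a);  /* demo only: success, tear down again */
        return 0;

    out_b:
        free(b);
    out_a:
        free(a);
    out:
        return -1;
    }

    int main(void)
    {
        printf("fail@2 -> %d, ok -> %d\n", init_subsys(2), init_subsys(0));
        return 0;
    }
]
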
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index af17b00145e1..a8ab98b53a3a 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -54,6 +54,7 @@ static void cache_init(struct cache_head *h, struct cache_detail *detail)
h->last_refresh = now;
}

+static inline int cache_is_valid(struct cache_head *h);
static void cache_fresh_locked(struct cache_head *head, time_t expiry,
struct cache_detail *detail);
static void cache_fresh_unlocked(struct cache_head *head,
@@ -100,6 +101,8 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
if (cache_is_expired(detail, tmp)) {
hlist_del_init(&tmp->cache_list);
detail->entries --;
+ if (cache_is_valid(tmp) == -EAGAIN)
+ set_bit(CACHE_NEGATIVE, &tmp->flags);
cache_fresh_locked(tmp, 0, detail);
freeme = tmp;
break;
diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index e9653c42cdd1..8400211537a2 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -262,8 +262,14 @@ static int tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd,
if (msg->rep_type)
tipc_tlv_init(msg->rep, msg->rep_type);

- if (cmd->header)
- (*cmd->header)(msg);
+ if (cmd->header) {
+ err = (*cmd->header)(msg);
+ if (err) {
+ kfree_skb(msg->rep);
+ msg->rep = NULL;
+ return err;
+ }
+ }

arg = nlmsg_new(0, GFP_KERNEL);
if (!arg) {
@@ -382,7 +388,12 @@ static int tipc_nl_compat_bearer_enable(struct tipc_nl_compat_cmd_doit *cmd,
if (!bearer)
return -EMSGSIZE;

- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME);
+ len = TLV_GET_DATA_LEN(msg->req);
+ len -= offsetof(struct tipc_bearer_config, name);
+ if (len <= 0)
+ return -EINVAL;
+
+ len = min_t(int, len, TIPC_MAX_BEARER_NAME);
if (!string_is_valid(b->name, len))
return -EINVAL;

@@ -727,7 +738,12 @@ static int tipc_nl_compat_link_set(struct tipc_nl_compat_cmd_doit *cmd,

lc = (struct tipc_link_config *)TLV_DATA(msg->req);

- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
+ len = TLV_GET_DATA_LEN(msg->req);
+ len -= offsetof(struct tipc_link_config, name);
+ if (len <= 0)
+ return -EINVAL;
+
+ len = min_t(int, len, TIPC_MAX_LINK_NAME);
if (!string_is_valid(lc->name, len))
return -EINVAL;
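
[Note: TLV_GET_DATA_LEN() is attacker-controlled, and the old min_t() clamp only
bounded it from above; a TLV shorter than the fixed part of the config struct produced
a negative remainder that was then used as a string bound. Subtracting the header
first and rejecting non-positive results closes that. The arithmetic in isolation
(illustrative struct layout):

    #include <stdio.h>
    #include <stddef.h>

    struct bearer_config_demo {
        unsigned int priority;
        unsigned int disc_domain;
        char name[32];
    };

    int main(void)
    {
        int tlv_data_len = 4;   /* attacker-supplied, shorter than the header */
        int len = tlv_data_len -
                  (int)offsetof(struct bearer_config_demo, name);

        if (len <= 0)
            printf("reject: no room for name (len=%d)\n", len);
        return 0;
    }
]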

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 5e9cf7d146f0..e61a5c29b08c 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -156,9 +156,7 @@ cc-ldoption = $(call try-run,\

# ld-option
# Usage: LDFLAGS += $(call ld-option, -X)
-ld-option = $(call try-run,\
- $(CC) $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) -x c /dev/null -c -o "$$TMPO"; \
- $(LD) $(LDFLAGS) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
+ld-option = $(call try-run, $(LD) $(LDFLAGS) $(1) -v,$(1),$(2))

# ar-option
# Usage: KBUILD_ARFLAGS := $(call ar-option,D)
diff --git a/scripts/kconfig/lxdialog/inputbox.c b/scripts/kconfig/lxdialog/inputbox.c
index d58de1dc5360..510049a7bd1d 100644
--- a/scripts/kconfig/lxdialog/inputbox.c
+++ b/scripts/kconfig/lxdialog/inputbox.c
@@ -126,7 +126,8 @@ do_resize:
case KEY_DOWN:
break;
case KEY_BACKSPACE:
- case 127:
+ case 8: /* ^H */
+ case 127: /* ^? */
if (pos) {
wattrset(dialog, dlg.inputbox.atr);
if (input_x == 0) {
diff --git a/scripts/kconfig/nconf.c b/scripts/kconfig/nconf.c
index d42d534a66cd..f7049e288e93 100644
--- a/scripts/kconfig/nconf.c
+++ b/scripts/kconfig/nconf.c
@@ -1046,7 +1046,7 @@ static int do_match(int key, struct match_state *state, int *ans)
state->match_direction = FIND_NEXT_MATCH_UP;
*ans = get_mext_match(state->pattern,
state->match_direction);
- } else if (key == KEY_BACKSPACE || key == 127) {
+ } else if (key == KEY_BACKSPACE || key == 8 || key == 127) {
state->pattern[strlen(state->pattern)-1] = '\0';
adj_match_dir(&state->match_direction);
} else
diff --git a/scripts/kconfig/nconf.gui.c b/scripts/kconfig/nconf.gui.c
index 4b2f44c20caf..9a65035cf787 100644
--- a/scripts/kconfig/nconf.gui.c
+++ b/scripts/kconfig/nconf.gui.c
@@ -439,7 +439,8 @@ int dialog_inputbox(WINDOW *main_window,
case KEY_F(F_EXIT):
case KEY_F(F_BACK):
break;
- case 127:
+ case 8: /* ^H */
+ case 127: /* ^? */
case KEY_BACKSPACE:
if (cursor_position > 0) {
memmove(&result[cursor_position-1],
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 99212ff6a568..ab2759d88bc6 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -396,21 +396,43 @@ static int may_context_mount_inode_relabel(u32 sid,
return rc;
}

-static int selinux_is_sblabel_mnt(struct super_block *sb)
+static int selinux_is_genfs_special_handling(struct super_block *sb)
{
- struct superblock_security_struct *sbsec = sb->s_security;
-
- return sbsec->behavior == SECURITY_FS_USE_XATTR ||
- sbsec->behavior == SECURITY_FS_USE_TRANS ||
- sbsec->behavior == SECURITY_FS_USE_TASK ||
- sbsec->behavior == SECURITY_FS_USE_NATIVE ||
- /* Special handling. Genfs but also in-core setxattr handler */
- !strcmp(sb->s_type->name, "sysfs") ||
+ /* Special handling. Genfs but also in-core setxattr handler */
+ return !strcmp(sb->s_type->name, "sysfs") ||
!strcmp(sb->s_type->name, "pstore") ||
!strcmp(sb->s_type->name, "debugfs") ||
!strcmp(sb->s_type->name, "rootfs");
}

+static int selinux_is_sblabel_mnt(struct super_block *sb)
+{
+ struct superblock_security_struct *sbsec = sb->s_security;
+
+ /*
+ * IMPORTANT: Double-check logic in this function when adding a new
+ * SECURITY_FS_USE_* definition!
+ */
+ BUILD_BUG_ON(SECURITY_FS_USE_MAX != 7);
+
+ switch (sbsec->behavior) {
+ case SECURITY_FS_USE_XATTR:
+ case SECURITY_FS_USE_TRANS:
+ case SECURITY_FS_USE_TASK:
+ case SECURITY_FS_USE_NATIVE:
+ return 1;
+
+ case SECURITY_FS_USE_GENFS:
+ return selinux_is_genfs_special_handling(sb);
+
+ /* Never allow relabeling on context mounts */
+ case SECURITY_FS_USE_MNTPOINT:
+ case SECURITY_FS_USE_NONE:
+ default:
+ return 0;
+ }
+}
+
static int sb_finish_set_opts(struct super_block *sb)
{
struct superblock_security_struct *sbsec = sb->s_security;
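
[Note: rewriting the behavior check as an exhaustive switch plus
BUILD_BUG_ON(SECURITY_FS_USE_MAX != 7) turns "someone adds a new FS_USE type and
forgets this function" from a silent policy hole into a compile failure. The same
tripwire in portable C11 (illustrative enum mirroring the seven behaviors):

    #include <assert.h>

    enum fs_use { FS_USE_XATTR, FS_USE_TRANS, FS_USE_TASK, FS_USE_GENFS,
                  FS_USE_MNTPOINT, FS_USE_NATIVE, FS_USE_NONE, FS_USE_MAX };

    /* Build fails if a behavior is added without revisiting the switch
     * that consumes this enum. */
    static_assert(FS_USE_MAX == 7, "audit the sblabel switch");

    int main(void) { return 0; }
]
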
diff --git a/sound/soc/codecs/cs4270.c b/sound/soc/codecs/cs4270.c
index 3670086b9227..f273533c6653 100644
--- a/sound/soc/codecs/cs4270.c
+++ b/sound/soc/codecs/cs4270.c
@@ -641,6 +641,7 @@ static const struct regmap_config cs4270_regmap = {
.reg_defaults = cs4270_reg_defaults,
.num_reg_defaults = ARRAY_SIZE(cs4270_reg_defaults),
.cache_type = REGCACHE_RBTREE,
+ .write_flag_mask = CS4270_I2C_INCR,

.readable_reg = cs4270_reg_is_readable,
.volatile_reg = cs4270_reg_is_volatile,
diff --git a/sound/soc/codecs/tlv320aic32x4.c b/sound/soc/codecs/tlv320aic32x4.c
index f2d3191961e1..714bd0e3fc71 100644
--- a/sound/soc/codecs/tlv320aic32x4.c
+++ b/sound/soc/codecs/tlv320aic32x4.c
@@ -234,6 +234,8 @@ static const struct snd_soc_dapm_widget aic32x4_dapm_widgets[] = {
SND_SOC_DAPM_INPUT("IN2_R"),
SND_SOC_DAPM_INPUT("IN3_L"),
SND_SOC_DAPM_INPUT("IN3_R"),
+ SND_SOC_DAPM_INPUT("CM_L"),
+ SND_SOC_DAPM_INPUT("CM_R"),
};

static const struct snd_soc_dapm_route aic32x4_dapm_routes[] = {
diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c
index c9452e02e0dd..c0a50ecb6dbd 100644
--- a/sound/soc/intel/common/sst-dsp.c
+++ b/sound/soc/intel/common/sst-dsp.c
@@ -463,11 +463,15 @@ struct sst_dsp *sst_dsp_new(struct device *dev,
goto irq_err;

err = sst_dma_new(sst);
- if (err)
- dev_warn(dev, "sst_dma_new failed %d\n", err);
+ if (err) {
+ dev_err(dev, "sst_dma_new failed %d\n", err);
+ goto dma_err;
+ }

return sst;

+dma_err:
+ free_irq(sst->irq, sst);
irq_err:
if (sst->ops->free)
sst->ops->free(sst);
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index f99eb8f44282..1c0d44c86c01 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -882,10 +882,13 @@ static int soc_pcm_hw_params(struct snd_pcm_substream *substream,
codec_params = *params;

/* fixup params based on TDM slot masks */
- if (codec_dai->tx_mask)
+ if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK &&
+ codec_dai->tx_mask)
soc_pcm_codec_params_fixup(&codec_params,
codec_dai->tx_mask);
- if (codec_dai->rx_mask)
+
+ if (substream->stream == SNDRV_PCM_STREAM_CAPTURE &&
+ codec_dai->rx_mask)
soc_pcm_codec_params_fixup(&codec_params,
codec_dai->rx_mask);

diff --git a/sound/usb/line6/driver.c b/sound/usb/line6/driver.c
index be78078a10ba..954dc4423cb0 100644
--- a/sound/usb/line6/driver.c
+++ b/sound/usb/line6/driver.c
@@ -307,12 +307,16 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data,
{
struct usb_device *usbdev = line6->usbdev;
int ret;
- unsigned char len;
+ unsigned char *len;
unsigned count;

if (address > 0xffff || datalen > 0xff)
return -EINVAL;

+ len = kmalloc(sizeof(*len), GFP_KERNEL);
+ if (!len)
+ return -ENOMEM;
+
/* query the serial number: */
ret = usb_control_msg(usbdev, usb_sndctrlpipe(usbdev, 0), 0x67,
USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT,
@@ -321,7 +325,7 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data,

if (ret < 0) {
dev_err(line6->ifcdev, "read request failed (error %d)\n", ret);
- return ret;
+ goto exit;
}

/* Wait for data length. We'll get 0xff until length arrives. */
@@ -331,28 +335,29 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data,
ret = usb_control_msg(usbdev, usb_rcvctrlpipe(usbdev, 0), 0x67,
USB_TYPE_VENDOR | USB_RECIP_DEVICE |
USB_DIR_IN,
- 0x0012, 0x0000, &len, 1,
+ 0x0012, 0x0000, len, 1,
LINE6_TIMEOUT * HZ);
if (ret < 0) {
dev_err(line6->ifcdev,
"receive length failed (error %d)\n", ret);
- return ret;
+ goto exit;
}

- if (len != 0xff)
+ if (*len != 0xff)
break;
}

- if (len == 0xff) {
+ ret = -EIO;
+ if (*len == 0xff) {
dev_err(line6->ifcdev, "read failed after %d retries\n",
count);
- return -EIO;
- } else if (len != datalen) {
+ goto exit;
+ } else if (*len != datalen) {
/* should be equal or something went wrong */
dev_err(line6->ifcdev,
"length mismatch (expected %d, got %d)\n",
- (int)datalen, (int)len);
- return -EIO;
+ (int)datalen, (int)*len);
+ goto exit;
}

/* receive the result: */
@@ -361,12 +366,12 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data,
0x0013, 0x0000, data, datalen,
LINE6_TIMEOUT * HZ);

- if (ret < 0) {
+ if (ret < 0)
dev_err(line6->ifcdev, "read failed (error %d)\n", ret);
- return ret;
- }

- return 0;
+exit:
+ kfree(len);
+ return ret;
}
EXPORT_SYMBOL_GPL(line6_read_data);

@@ -378,12 +383,16 @@ int line6_write_data(struct usb_line6 *line6, unsigned address, void *data,
{
struct usb_device *usbdev = line6->usbdev;
int ret;
- unsigned char status;
+ unsigned char *status;
int count;

if (address > 0xffff || datalen > 0xffff)
return -EINVAL;

+ status = kmalloc(sizeof(*status), GFP_KERNEL);
+ if (!status)
+ return -ENOMEM;
+
ret = usb_control_msg(usbdev, usb_sndctrlpipe(usbdev, 0), 0x67,
USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT,
0x0022, address, data, datalen,
@@ -392,7 +401,7 @@ int line6_write_data(struct usb_line6 *line6, unsigned address, void *data,
if (ret < 0) {
dev_err(line6->ifcdev,
"write request failed (error %d)\n", ret);
- return ret;
+ goto exit;
}

for (count = 0; count < LINE6_READ_WRITE_MAX_RETRIES; count++) {
@@ -403,28 +412,29 @@ int line6_write_data(struct usb_line6 *line6, unsigned address, void *data,
USB_TYPE_VENDOR | USB_RECIP_DEVICE |
USB_DIR_IN,
0x0012, 0x0000,
- &status, 1, LINE6_TIMEOUT * HZ);
+ status, 1, LINE6_TIMEOUT * HZ);

if (ret < 0) {
dev_err(line6->ifcdev,
"receiving status failed (error %d)\n", ret);
- return ret;
+ goto exit;
}

- if (status != 0xff)
+ if (*status != 0xff)
break;
}

- if (status == 0xff) {
+ if (*status == 0xff) {
dev_err(line6->ifcdev, "write failed after %d retries\n",
count);
- return -EIO;
- } else if (status != 0) {
+ ret = -EIO;
+ } else if (*status != 0) {
dev_err(line6->ifcdev, "write failed (error %d)\n", ret);
- return -EIO;
+ ret = -EIO;
}
-
- return 0;
+exit:
+ kfree(status);
+ return ret;
}
EXPORT_SYMBOL_GPL(line6_write_data);
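
[Note: the single-byte len/status variables move from the stack to kmalloc() because
buffers handed to usb_control_msg() must be DMA-capable; with CONFIG_VMAP_STACK the
stack can be vmalloc-backed and is not. That in turn forces the single-exit kfree()
restructuring seen above. The resulting pattern (hedged kernel-style sketch, not
compilable standalone; names are illustrative):

    u8 *buf = kmalloc(1, GFP_KERNEL);   /* DMA-able; never &stack_var */

    if (!buf)
        return -ENOMEM;
    ret = usb_control_msg(usbdev, usb_rcvctrlpipe(usbdev, 0), request,
                          USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_IN,
                          value, index, buf, 1, timeout);
    /* ... inspect *buf ... */
    kfree(buf);                         /* one exit path frees it */
    return ret < 0 ? ret : 0;
]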

diff --git a/sound/usb/line6/toneport.c b/sound/usb/line6/toneport.c
index 6d4c50c9b17d..5512b3d532e7 100644
--- a/sound/usb/line6/toneport.c
+++ b/sound/usb/line6/toneport.c
@@ -365,15 +365,20 @@ static bool toneport_has_source_select(struct usb_line6_toneport *toneport)
/*
Setup Toneport device.
*/
-static void toneport_setup(struct usb_line6_toneport *toneport)
+static int toneport_setup(struct usb_line6_toneport *toneport)
{
- int ticks;
+ int *ticks;
struct usb_line6 *line6 = &toneport->line6;
struct usb_device *usbdev = line6->usbdev;

+ ticks = kmalloc(sizeof(*ticks), GFP_KERNEL);
+ if (!ticks)
+ return -ENOMEM;
+
/* sync time on device with host: */
- ticks = (int)get_seconds();
- line6_write_data(line6, 0x80c6, &ticks, 4);
+ *ticks = (int)get_seconds();
+ line6_write_data(line6, 0x80c6, ticks, 4);
+ kfree(ticks);

/* enable device: */
toneport_send_cmd(usbdev, 0x0301, 0x0000);
@@ -388,6 +393,7 @@ static void toneport_setup(struct usb_line6_toneport *toneport)
toneport_update_led(toneport);

mod_timer(&toneport->timer, jiffies + TONEPORT_PCM_DELAY * HZ);
+ return 0;
}

/*
@@ -451,7 +457,9 @@ static int toneport_init(struct usb_line6 *line6,
return err;
}

- toneport_setup(toneport);
+ err = toneport_setup(toneport);
+ if (err)
+ return err;

/* register audio system: */
return snd_card_register(line6->card);
@@ -463,7 +471,11 @@ static int toneport_init(struct usb_line6 *line6,
*/
static int toneport_reset_resume(struct usb_interface *interface)
{
- toneport_setup(usb_get_intfdata(interface));
+ int err;
+
+ err = toneport_setup(usb_get_intfdata(interface));
+ if (err)
+ return err;
return line6_resume(interface);
}
#endif
diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 743746a3c50d..df3c73e9dea4 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -2201,7 +2201,7 @@ eval_type_str(unsigned long long val, const char *type, int pointer)
return val & 0xffffffff;

if (strcmp(type, "u64") == 0 ||
- strcmp(type, "s64"))
+ strcmp(type, "s64") == 0)
return val;

if (strcmp(type, "s8") == 0)
diff --git a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile
index e367b1a85d70..3c04e2a85599 100644
--- a/tools/power/x86/turbostat/Makefile
+++ b/tools/power/x86/turbostat/Makefile
@@ -8,7 +8,7 @@ ifeq ("$(origin O)", "command line")
endif

turbostat : turbostat.c
-CFLAGS += -Wall
+CFLAGS += -Wall -I../../../include
CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'

%: %.c
diff --git a/tools/testing/selftests/net/run_netsocktests b/tools/testing/selftests/net/run_netsocktests
index 16058bbea7a8..c195b4478662 100755
--- a/tools/testing/selftests/net/run_netsocktests
+++ b/tools/testing/selftests/net/run_netsocktests
@@ -6,7 +6,7 @@ echo "--------------------"
./socket
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ exit 1
else
echo "[PASS]"
fi
-