Re: [PATCH 1/4] arm64: hyperv: Add core Hyper-V include files
From: Will Deacon
Date: Fri Dec 07 2018 - 08:42:18 EST
Hi all,
On Thu, Nov 22, 2018 at 03:10:56AM +0000, kys@xxxxxxxxxxxxxxxxx wrote:
> From: Michael Kelley <mikelley@xxxxxxxxxxxxx>
>
> hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level
> Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64,
> and Hyper-V has not separated out the architecture-dependent parts into
> x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64
> that is not yet formally published. The TLFS is available here:
When do you plan to publish the spec? It's pretty hard to review this stuff
without knowing what it's supposed to look like.
> docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
>
> mshyperv.h defines Linux-specific structures and routines for
> interacting with Hyper-V. It is split into an ARM64 specific file
> and an architecture independent file in include/asm-generic.
>
> Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Signed-off-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
> ---
> MAINTAINERS | 3 +
> arch/arm64/include/asm/hyperv-tlfs.h | 338 +++++++++++++++++++++++++++
> arch/arm64/include/asm/mshyperv.h | 116 +++++++++
> include/asm-generic/mshyperv.h | 240 +++++++++++++++++++
> 4 files changed, 697 insertions(+)
> create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h
> create mode 100644 arch/arm64/include/asm/mshyperv.h
> create mode 100644 include/asm-generic/mshyperv.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f4855974f325..72f19cef4c48 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6835,6 +6835,8 @@ F: arch/x86/include/asm/trace/hyperv.h
> F: arch/x86/include/asm/hyperv-tlfs.h
> F: arch/x86/kernel/cpu/mshyperv.c
> F: arch/x86/hyperv
> +F: arch/arm64/include/asm/hyperv-tlfs.h
> +F: arch/arm64/include/asm/mshyperv.h
> F: drivers/hid/hid-hyperv.c
> F: drivers/hv/
> F: drivers/input/serio/hyperv-keyboard.c
> @@ -6846,6 +6848,7 @@ F: drivers/video/fbdev/hyperv_fb.c
> F: net/vmw_vsock/hyperv_transport.c
> F: include/linux/hyperv.h
> F: include/uapi/linux/hyperv.h
> +F: include/asm-generic/mshyperv.h
> F: tools/hv/
> F: Documentation/ABI/stable/sysfs-bus-vmbus
>
> diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h
> new file mode 100644
> index 000000000000..924e37600e92
> --- /dev/null
> +++ b/arch/arm64/include/asm/hyperv-tlfs.h
> @@ -0,0 +1,338 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * This file contains definitions from the Hyper-V Hypervisor Top-Level
> + * Functional Specification (TLFS):
> + * https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fvirtualization%2Fhyper-v-on-windows%2Freference%2Ftlfs&data=02%7C01%7Ckys%40microsoft.com%7Cc831a45fd63e4a4b083908d641216aa8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636768009113747528&sdata=jRSrs9ZWXdmeS7LQUEpoSyUfBS7a5KLYy%2FolFdE2tI0%3D&reserved=0
As mentioned elsewhere, please use a better link here and drop the license
boilerplate below.
> + *
> + * Copyright (C) 2018, Microsoft, Inc.
> + *
> + * Author : Michael Kelley <mikelley@xxxxxxxxxxxxx>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT. See the GNU General Public License for more
> + * details.
> + */
> +
> +#ifndef _ASM_ARM64_HYPERV_H
> +#define _ASM_ARM64_HYPERV_H
__ASM_HYPER_V_H please
> +
> +#include <linux/types.h>
> +
> +/*
> + * These Hyper-V registers provide information equivalent to the CPUID
> + * instruction on x86/x64.
> + */
> +#define HV_REGISTER_HYPERVISOR_VERSION 0x00000100 /*CPUID 0x40000002 */
> +#define HV_REGISTER_PRIVILEGES_AND_FEATURES 0x00000200 /*CPUID 0x40000003 */
> +#define HV_REGISTER_FEATURES 0x00000201 /*CPUID 0x40000004 */
> +#define HV_REGISTER_IMPLEMENTATION_LIMITS 0x00000202 /*CPUID 0x40000005 */
> +#define HV_ARM64_REGISTER_INTERFACE_VERSION 0x00090006 /*CPUID 0x40000001 */
> +
> +/*
> + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a
> + * 128-bit value with flags indicating which features are available to the
> + * partition based upon the current partition privileges. The 128-bit
> + * value is broken up with different portions stored in different 32-bit
> + * fields in the ms_hyperv structure.
> + */
> +
> +/* Partition Reference Counter available*/
> +#define HV_MSR_TIME_REF_COUNT_AVAILABLE (1 << 1)
> +
> +/*
> + * Synthetic Timers available
> + */
> +#define HV_MSR_SYNTIMER_AVAILABLE (1 << 3)
> +
> +/* Frequency MSRs available */
> +#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLE (1 << 8)
> +
> +/* Reference TSC available */
> +#define HV_MSR_REFERENCE_TSC_AVAILABLE (1 << 9)
> +
> +/* Crash MSR available */
> +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE (1 << 10)
> +
> +
> +/*
> + * This group of flags is in the high order 64-bits of the returned
> + * 128-bit value.
> + */
> +
> +/* STIMER direct mode is available */
> +#define HV_STIMER_DIRECT_MODE_AVAILABLE (1 << 19)
> +
> +/*
> + * Implementation recommendations in register
> + * HvRegisterFeaturesInfo. Indicates which behaviors the hypervisor
> + * recommends the OS implement for optimal performance.
> + */
> +
> +/*
> + * Recommend not using Auto EOI
> + */
> +#define HV_DEPRECATING_AEOI_RECOMMENDED (1 << 9)
> +
> +/*
> + * Synthetic register definitions equivalent to MSRs on x86/x64
> + */
> +#define HV_REGISTER_CRASH_P0 0x00000210
> +#define HV_REGISTER_CRASH_P1 0x00000211
> +#define HV_REGISTER_CRASH_P2 0x00000212
> +#define HV_REGISTER_CRASH_P3 0x00000213
> +#define HV_REGISTER_CRASH_P4 0x00000214
> +#define HV_REGISTER_CRASH_CTL 0x00000215
> +
> +#define HV_REGISTER_GUEST_OSID 0x00090002
> +#define HV_REGISTER_VPINDEX 0x00090003
> +#define HV_REGISTER_TIME_REFCOUNT 0x00090004
> +#define HV_REGISTER_REFERENCE_TSC 0x00090017
> +
> +#define HV_REGISTER_SINT0 0x000A0000
> +#define HV_REGISTER_SINT1 0x000A0001
> +#define HV_REGISTER_SINT2 0x000A0002
> +#define HV_REGISTER_SINT3 0x000A0003
> +#define HV_REGISTER_SINT4 0x000A0004
> +#define HV_REGISTER_SINT5 0x000A0005
> +#define HV_REGISTER_SINT6 0x000A0006
> +#define HV_REGISTER_SINT7 0x000A0007
> +#define HV_REGISTER_SINT8 0x000A0008
> +#define HV_REGISTER_SINT9 0x000A0009
> +#define HV_REGISTER_SINT10 0x000A000A
> +#define HV_REGISTER_SINT11 0x000A000B
> +#define HV_REGISTER_SINT12 0x000A000C
> +#define HV_REGISTER_SINT13 0x000A000D
> +#define HV_REGISTER_SINT14 0x000A000E
> +#define HV_REGISTER_SINT15 0x000A000F
> +#define HV_REGISTER_SCONTROL 0x000A0010
> +#define HV_REGISTER_SVERSION 0x000A0011
> +#define HV_REGISTER_SIFP 0x000A0012
> +#define HV_REGISTER_SIPP 0x000A0013
> +#define HV_REGISTER_EOM 0x000A0014
> +#define HV_REGISTER_SIRBP 0x000A0015
> +
> +#define HV_REGISTER_STIMER0_CONFIG 0x000B0000
> +#define HV_REGISTER_STIMER0_COUNT 0x000B0001
> +#define HV_REGISTER_STIMER1_CONFIG 0x000B0002
> +#define HV_REGISTER_STIMER1_COUNT 0x000B0003
> +#define HV_REGISTER_STIMER2_CONFIG 0x000B0004
> +#define HV_REGISTER_STIMER2_COUNT 0x000B0005
> +#define HV_REGISTER_STIMER3_CONFIG 0x000B0006
> +#define HV_REGISTER_STIMER3_COUNT 0x000B0007
> +
> +/*
> + * Crash notification flags.
> + */
> +#define HV_CRASH_CTL_CRASH_NOTIFY_MSG BIT_ULL(62)
> +#define HV_CRASH_CTL_CRASH_NOTIFY BIT_ULL(63)
> +
> +/*
> + * The guest OS needs to register the guest ID with the hypervisor.
> + * The guest ID is a 64 bit entity and the structure of this ID is
> + * specified in the Hyper-V specification:
> + *
> + * msdn.microsoft.com/en-us/library/windows/hardware/ff542653%28v=vs.85%29.aspx
Dead link :(
> + *
> + * While the current guideline does not specify how Linux guest ID(s)
> + * need to be generated, our plan is to publish the guidelines for
> + * Linux and other guest operating systems that currently are hosted
> + * on Hyper-V. The implementation here conforms to this yet
> + * unpublished guidelines.
Again, any timeline on publishing this stuff? Right now, this is just a file
full of magic numbers and I'm not sure we're in a good position to maintain
it based on the idea that you have a cunning plan.
> + *
> + *
> + * Bit(s)
> + * 63 - Indicates if the OS is Open Source or not; 1 is Open Source
> + * 62:56 - Os Type; Linux is 0x100
> + * 55:48 - Distro specific identification
> + * 47:16 - Linux kernel version number
> + * 15:0 - Distro specific identification
> + *
> + *
> + */
> +#define HV_LINUX_VENDOR_ID 0x8100
> +
> +/* Declare the various hypercall operations. */
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE 0x0002
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST 0x0003
> +#define HVCALL_NOTIFY_LONG_SPIN_WAIT 0x0008
> +#define HVCALL_SEND_IPI 0x000b
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX 0x0013
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX 0x0014
> +#define HVCALL_SEND_IPI_EX 0x0015
> +#define HVCALL_GET_VP_REGISTERS 0x0050
> +#define HVCALL_SET_VP_REGISTERS 0x0051
> +#define HVCALL_POST_MESSAGE 0x005c
> +#define HVCALL_SIGNAL_EVENT 0x005d
> +#define HVCALL_RETARGET_INTERRUPT 0x007e
> +#define HVCALL_START_VIRTUAL_PROCESSOR 0x0099
> +#define HVCALL_GET_VP_INDEX_FROM_APICID 0x009a
This all sounds very x86y...
> +/* Declare standard hypercall field values. */
> +#define HV_PARTITION_ID_SELF ((u64)-1)
> +#define HV_VP_INDEX_SELF ((u32)-2)
> +
> +#define HV_HYPERCALL_FAST_BIT BIT(16)
> +#define HV_HYPERCALL_REP_COUNT_1 BIT_ULL(32)
> +#define HV_HYPERCALL_RESULT_MASK GENMASK_ULL(15, 0)
> +
> +/* Define the hypercall status result */
> +
> +union hv_hypercall_status {
> + u64 as_uint64;
> + struct {
> + u16 status;
> + u16 reserved;
> + u16 reps_completed; /* Low 12 bits */
> + u16 reserved2;
> + };
> +};
> +
> +/* hypercall status code */
> +#define HV_STATUS_SUCCESS 0
> +#define HV_STATUS_INVALID_HYPERCALL_CODE 2
> +#define HV_STATUS_INVALID_HYPERCALL_INPUT 3
> +#define HV_STATUS_INVALID_ALIGNMENT 4
> +#define HV_STATUS_INSUFFICIENT_MEMORY 11
> +#define HV_STATUS_INVALID_CONNECTION_ID 18
> +#define HV_STATUS_INSUFFICIENT_BUFFERS 19
> +
> +/* Define output layout for Get VP Register hypercall */
> +struct hv_get_vp_register_output {
> + u64 registervaluelow;
> + u64 registervaluehigh;
> +};
> +
> +#define HV_FLUSH_ALL_PROCESSORS BIT(0)
> +#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES BIT(1)
> +#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY BIT(2)
> +#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3)
> +
> +enum HV_GENERIC_SET_FORMAT {
> + HV_GENERIC_SET_SPARSE_4K,
> + HV_GENERIC_SET_ALL,
> +};
> +
> +/*
> + * The Hyper-V TimeRefCount register and the TSC
> + * page provide a guest VM clock with 100ns tick rate
> + */
> +#define HV_CLOCK_HZ (NSEC_PER_SEC/100)
> +
> +/*
> + * The fields in this structure are set by Hyper-V and read
> + * by the Linux guest. They should be accessed with READ_ONCE()
> + * so the compiler doesn't optimize in a way that will cause
> + * problems.
> + */
> +struct ms_hyperv_tsc_page {
> + u32 tsc_sequence;
> + u32 reserved1;
> + u64 tsc_scale;
> + s64 tsc_offset;
> + u64 reserved2[509];
> +};
It might be clearer to express this as a union with the page size:
struct ms_hyperv_tsc_page {
union {
struct {
__u32 tsc_sequence;
__u32 reserved1;
};
__u8 reserved2[HV_HYP_PAGE_SIZE];
};
};
What's the required alignment on this structure?
Also, do you intend for 32-bit guests to use this ABI as well?
> +
> +/* Define the number of synthetic interrupt sources. */
> +#define HV_SYNIC_SINT_COUNT (16)
> +/* Define the expected SynIC version. */
> +#define HV_SYNIC_VERSION_1 (0x1)
> +
> +#define HV_SYNIC_CONTROL_ENABLE (1ULL << 0)
> +#define HV_SYNIC_SIMP_ENABLE (1ULL << 0)
> +#define HV_SYNIC_SIEFP_ENABLE (1ULL << 0)
> +#define HV_SYNIC_SINT_MASKED (1ULL << 16)
> +#define HV_SYNIC_SINT_AUTO_EOI (1ULL << 17)
> +#define HV_SYNIC_SINT_VECTOR_MASK (0xFF)
> +
> +#define HV_SYNIC_STIMER_COUNT (4)
> +
> +/* Define synthetic interrupt controller message constants. */
> +#define HV_MESSAGE_SIZE (256)
> +#define HV_MESSAGE_PAYLOAD_BYTE_COUNT (240)
> +#define HV_MESSAGE_PAYLOAD_QWORD_COUNT (30)
> +
> +/* Define hypervisor message types. */
> +enum hv_message_type {
> + HVMSG_NONE = 0x00000000,
> +
> + /* Memory access messages. */
> + HVMSG_UNMAPPED_GPA = 0x80000000,
> + HVMSG_GPA_INTERCEPT = 0x80000001,
> +
> + /* Timer notification messages. */
> + HVMSG_TIMER_EXPIRED = 0x80000010,
> +
> + /* Error messages. */
> + HVMSG_INVALID_VP_REGISTER_VALUE = 0x80000020,
> + HVMSG_UNRECOVERABLE_EXCEPTION = 0x80000021,
> + HVMSG_UNSUPPORTED_FEATURE = 0x80000022,
> +
> + /* Trace buffer complete messages. */
> + HVMSG_EVENTLOG_BUFFERCOMPLETE = 0x80000040,
> +};
> +
> +/* Define synthetic interrupt controller message flags. */
> +union hv_message_flags {
> + __u8 asu8;
> + struct {
> + __u8 msg_pending:1;
> + __u8 reserved:7;
> + };
> +};
> +
> +/* Define port identifier type. */
> +union hv_port_id {
> + __u32 asu32;
> + struct {
> + __u32 id:24;
> + __u32 reserved:8;
> + } u;
> +};
Hmm, I thought bitfields weren't a good idea for this sort of thing. Isn't
it up to the compiler how they are laid out in memory?
> +
> +/* Define synthetic interrupt controller message header. */
> +struct hv_message_header {
> + __u32 message_type;
> + __u8 payload_size;
> + union hv_message_flags message_flags;
> + __u8 reserved[2];
> + union {
> + __u64 sender;
> + union hv_port_id port;
> + };
> +};
> +
> +/* Define synthetic interrupt controller message format. */
> +struct hv_message {
> + struct hv_message_header header;
> + union {
> + __u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
> + } u;
> +};
> +
> +/* Define the synthetic interrupt message page layout. */
> +struct hv_message_page {
> + struct hv_message sint_message[HV_SYNIC_SINT_COUNT];
> +};
> +
> +/* Define timer message payload structure. */
> +struct hv_timer_message_payload {
> + __u32 timer_index;
> + __u32 reserved;
> + __u64 expiration_time; /* When the timer expired */
> + __u64 delivery_time; /* When the message was delivered */
> +};
> +
> +#define HV_STIMER_ENABLE (1ULL << 0)
> +#define HV_STIMER_PERIODIC (1ULL << 1)
> +#define HV_STIMER_LAZY (1ULL << 2)
> +#define HV_STIMER_AUTOENABLE (1ULL << 3)
> +#define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)
> +
> +#endif
> diff --git a/arch/arm64/include/asm/mshyperv.h b/arch/arm64/include/asm/mshyperv.h
> new file mode 100644
> index 000000000000..a87c431d58b3
> --- /dev/null
> +++ b/arch/arm64/include/asm/mshyperv.h
> @@ -0,0 +1,116 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Linux-specific definitions for managing interactions with Microsoft's
> + * Hyper-V hypervisor. The definitions in this file are specific to
> + * the ARM64 architecture. See include/asm-generic/mshyperv.h for
> + * definitions are that architecture independent.
> + *
> + * Definitions that are specified in the Hyper-V Top Level Functional
> + * Spec (TLFS) should not go in this file, but should instead go in
> + * hyperv-tlfs.h.
> + *
> + * Copyright (C) 2018, Microsoft, Inc.
> + *
> + * Author : Michael Kelley <mikelley@xxxxxxxxxxxxx>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT. See the GNU General Public License for more
> + * details.
> + */
> +
> +#ifndef _ASM_ARM64_MSHYPERV_H
> +#define _ASM_ARM64_MSHYPERV_H
> +
> +#include <linux/types.h>
> +#include <linux/interrupt.h>
> +#include <linux/clocksource.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> +#include <asm/hyperv-tlfs.h>
> +
> +/*
> + * Define the IRQ numbers/vectors used by Hyper-V VMbus interrupts
> + * and by STIMER0 Direct Mode interrupts. Hyper-V should be supplying
> + * these values through ACPI, but there are no other interrupting
> + * devices in a Hyper-V VM on ARM64, so it's OK to hard code for now.
> + * The "CALLBACK_VECTOR" terminology is a left-over from the x86/x64
> + * world that is used in architecture independent Hyper-V code.
> + */
> +#define HYPERVISOR_CALLBACK_VECTOR 16
> +#define HV_STIMER0_IRQNR 17
> +
> +extern u64 hv_do_hvc(u64 control, ...);
> +extern u64 hv_do_hvc_fast_get(u64 control, u64 input1, u64 input2, u64 input3,
> + struct hv_get_vp_register_output *output);
> +
> +/*
> + * Declare calls to get and set Hyper-V VP register values on ARM64, which
> + * requires a hypercall.
> + */
> +extern void hv_set_vpreg(u32 reg, u64 value);
> +extern u64 hv_get_vpreg(u32 reg);
> +extern void hv_get_vpreg_128(u32 reg, struct hv_get_vp_register_output *result);
> +
> +/*
> + * Use the Hyper-V provided stimer0 as the timer that is made
> + * available to the architecture independent Hyper-V drivers.
> + */
> +#define hv_init_timer(timer, tick) \
> + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick)
> +#define hv_init_timer_config(timer, val) \
> + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val)
> +#define hv_get_current_tick(tick) \
> + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT))
> +
> +#define hv_get_simp(val) (val = hv_get_vpreg(HV_REGISTER_SIPP))
> +#define hv_set_simp(val) hv_set_vpreg(HV_REGISTER_SIPP, val)
> +
> +#define hv_get_siefp(val) (val = hv_get_vpreg(HV_REGISTER_SIFP))
> +#define hv_set_siefp(val) hv_set_vpreg(HV_REGISTER_SIFP, val)
> +
> +#define hv_get_synic_state(val) (val = hv_get_vpreg(HV_REGISTER_SCONTROL))
> +#define hv_set_synic_state(val) hv_set_vpreg(HV_REGISTER_SCONTROL, val)
> +
> +#define hv_get_vp_index(index) (index = hv_get_vpreg(HV_REGISTER_VPINDEX))
> +
> +#define hv_signal_eom() hv_set_vpreg(HV_REGISTER_EOM, 0)
> +
> +/*
> + * Hyper-V SINT registers are numbered sequentially, so we can just
> + * add the SINT number to the register number of SINT0
> + */
> +#define hv_get_synint_state(sint_num, val) \
> + (val = hv_get_vpreg(HV_REGISTER_SINT0 + sint_num))
> +#define hv_set_synint_state(sint_num, val) \
> + hv_set_vpreg(HV_REGISTER_SINT0 + sint_num, val)
> +
> +#define hv_get_crash_ctl(val) \
> + (val = hv_get_vpreg(HV_REGISTER_CRASH_CTL))
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +#define hv_enable_stimer0_percpu_irq(irq) enable_percpu_irq(irq, 0)
> +#define hv_disable_stimer0_percpu_irq(irq) disable_percpu_irq(irq)
> +#endif
> +
> +/* ARM64 specific code to read the hardware clock */
> +static inline u64 hv_read_hwclock(void)
> +{
> + u64 result;
> +
> + isb();
> + result = read_sysreg(cntvct_el0);
> + isb();
> +
> + return result;
> +}
Shouldn't you use arch_timer_read_counter() or arch_counter_get_cntvct()
instead?
> +#include <asm-generic/mshyperv.h>
> +
> +#endif
> diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
> new file mode 100644
> index 000000000000..fbe1ec89c85a
> --- /dev/null
> +++ b/include/asm-generic/mshyperv.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Linux-specific definitions for managing interactions with Microsoft's
> + * Hyper-V hypervisor. The definitions in this file are architecture
> + * independent. See arch/<arch>/include/asm/mshyperv.h for definitions
> + * that are specific to architecture <arch>.
> + *
> + * Definitions that are specified in the Hyper-V Top Level Functional
> + * Spec (TLFS) should not go in this file, but should instead go in
> + * hyperv-tlfs.h.
> + *
> + * Copyright (C) 2018, Microsoft, Inc.
> + *
> + * Author : Michael Kelley <mikelley@xxxxxxxxxxxxx>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT. See the GNU General Public License for more
> + * details.
> + */
> +
> +#ifndef _ASM_GENERIC_MSHYPERV_H
> +#define _ASM_GENERIC_MSHYPERV_H
> +
> +#include <linux/types.h>
> +#include <linux/interrupt.h>
> +#include <linux/clocksource.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> +#include <asm/hyperv-tlfs.h>
> +
> +/*
> + * Hyper-V always runs with a page size of 4096. These definitions
> + * are used when communicating with Hyper-V using guest physical
> + * pages and guest physical page addresses, since the guest page
> + * size may not be 4096 on ARM64.
> + */
> +#define HV_HYP_PAGE_SIZE 4096
> +#define HV_HYP_PAGE_SHIFT 12
Maybe define the page size in terms of the shift?
> +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1))
> +
> +
> +struct ms_hyperv_info {
> + u32 features;
> + u32 misc_features;
> + u32 hints;
> + u32 max_vp_index;
> + u32 max_lp_index;
> +};
What is the define endianness for in-memory structures shared with the
hypervisor?
> +extern struct ms_hyperv_info ms_hyperv;
> +
> +extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
> +extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
> +
> +/*
> + * The guest OS needs to register the guest ID with the hypervisor.
> + * The guest ID is a 64 bit entity and the structure of this ID is
> + * specified in the Hyper-V specification:
> + *
> + * msdn.microsoft.com/en-us/library/windows/hardware/ff542653%28v=vs.85%29.aspx
> + *
> + * While the current guideline does not specify how Linux guest ID(s)
> + * need to be generated, our plan is to publish the guidelines for
> + * Linux and other guest operating systems that currently are hosted
> + * on Hyper-V. The implementation here conforms to this yet
> + * unpublished guidelines.
> + *
> + *
> + * Bit(s)
> + * 63 - Indicates if the OS is Open Source or not; 1 is Open Source
> + * 62:56 - Os Type; Linux is 0x100
> + * 55:48 - Distro specific identification
> + * 47:16 - Linux kernel version number
> + * 15:0 - Distro specific identification
This looks familiar...
> + * Generate the guest ID based on the guideline described above.
> + */
> +
> +static inline __u64 generate_guest_id(__u64 d_info1, __u64 kernel_version,
> + __u64 d_info2)
> +{
> + __u64 guest_id = 0;
> +
> + guest_id = (((__u64)HV_LINUX_VENDOR_ID) << 48);
> + guest_id |= (d_info1 << 48);
> + guest_id |= (kernel_version << 16);
> + guest_id |= d_info2;
> +
> + return guest_id;
> +}
> +
> +
> +/* Free the message slot and signal end-of-message if required */
> +static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
> +{
> + /*
> + * On crash we're reading some other CPU's message page and we need
> + * to be careful: this other CPU may already had cleared the header
> + * and the host may already had delivered some other message there.
> + * In case we blindly write msg->header.message_type we're going
> + * to lose it. We can still lose a message of the same type but
> + * we count on the fact that there can only be one
> + * CHANNELMSG_UNLOAD_RESPONSE and we don't care about other messages
> + * on crash.
> + */
> + if (cmpxchg(&msg->header.message_type, old_msg_type,
> + HVMSG_NONE) != old_msg_type)
> + return;
> +
> + /*
> + * Make sure the write to MessageType (ie set to
> + * HVMSG_NONE) happens before we read the
> + * MessagePending and EOMing. Otherwise, the EOMing
> + * will not deliver any more messages since there is
> + * no empty slot
> + */
> + mb();
A successful cmpxchg() already implies full barriers, so this isn't needed.
I also wonder whether cmpxchg_acquire() would be sufficient here.
> +
> + if (msg->header.message_flags.msg_pending) {
> + /*
> + * This will cause message queue rescan to
> + * possibly deliver another msg from the
> + * hypervisor
> + */
> + hv_signal_eom();
> + }
> +}
> +
> +void hv_setup_vmbus_irq(void (*handler)(void));
> +void hv_remove_vmbus_irq(void);
> +void hv_enable_vmbus_irq(void);
> +void hv_disable_vmbus_irq(void);
> +
> +void hv_setup_kexec_handler(void (*handler)(void));
> +void hv_remove_kexec_handler(void);
> +void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs));
> +void hv_remove_crash_handler(void);
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +extern struct clocksource *hyperv_cs;
> +
> +/*
> + * Hypervisor's notion of virtual processor ID is different from
> + * Linux' notion of CPU ID. This information can only be retrieved
> + * in the context of the calling CPU. Setup a map for easy access
> + * to this information.
> + */
> +extern u32 *hv_vp_index;
> +extern u32 hv_max_vp_index;
> +
> +/* Sentinel value for an uninitialized entry in hv_vp_index array */
> +#define VP_INVAL U32_MAX
> +
> +/**
> + * hv_cpu_number_to_vp_number() - Map CPU to VP.
> + * @cpu_number: CPU number in Linux terms
> + *
> + * This function returns the mapping between the Linux processor
> + * number and the hypervisor's virtual processor number, useful
> + * in making hypercalls and such that talk about specific
> + * processors.
> + *
> + * Return: Virtual processor number in Hyper-V terms
> + */
> +static inline int hv_cpu_number_to_vp_number(int cpu_number)
> +{
> + return hv_vp_index[cpu_number];
> +}
> +
> +void hyperv_report_panic(struct pt_regs *regs, long err);
> +void hyperv_report_panic_msg(phys_addr_t pa, size_t size);
> +bool hv_is_hyperv_initialized(void);
> +void hyperv_cleanup(void);
> +#else /* CONFIG_HYPERV */
> +static inline bool hv_is_hyperv_initialized(void) { return false; }
> +static inline void hyperv_cleanup(void) {}
> +#endif /* CONFIG_HYPERV */
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +extern int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void));
> +extern void hv_remove_stimer0_irq(int irq);
> +#endif
> +
> +static inline u64 hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg,
> + u64 *cur_tsc)
> +{
> + u64 scale, offset;
> + u32 sequence;
> +
> + /*
> + * The protocol for reading Hyper-V TSC page is specified in Hypervisor
> + * Top-Level Functional Specification. To get the reference time we
> + * must do the following:
> + * - READ ReferenceTscSequence
> + * A special '0' value indicates the time source is unreliable and we
> + * need to use something else.
> + * - ReferenceTime =
> + * ((HWclock val) * ReferenceTscScale) >> 64) + ReferenceTscOffset
> + * - READ ReferenceTscSequence again. In case its value has changed
> + * since our first reading we need to discard ReferenceTime and repeat
> + * the whole sequence as the hypervisor was updating the page in
> + * between.
> + */
> + do {
> + sequence = READ_ONCE(tsc_pg->tsc_sequence);
> + /*
> + * Make sure we read sequence before we read other values from
> + * TSC page.
> + */
> + smp_rmb();
> +
> + scale = READ_ONCE(tsc_pg->tsc_scale);
> + offset = READ_ONCE(tsc_pg->tsc_offset);
> + *cur_tsc = hv_read_hwclock();
> +
> + /*
> + * Make sure we read sequence after we read all other values
> + * from TSC page.
> + */
> + smp_rmb();
> +
> + } while (READ_ONCE(tsc_pg->tsc_sequence) != sequence);
Could you explain what the writer side looks like, please? I'm a bit
confused as to why we're not using the bottom bit of the sequence
number to detect a concurrent update.
Will