Writing a clocksource driver for Linux
Today we are going to write a clocksource [1] driver for Linux!
A clocksource is one of several parts of the kernel's timekeeping abstractions. It is the timeline of the Linux system and is what you ultimately consult whenever you run the date command. To do this, the clocksource should provide a monotonic, atomic counter that is as accurate as possible.
Timekeeping is hard [2], and not all clocks are of the same quality. The kernel of course wants to select the best clocksource available, so every clock has to honestly specify its own quality in the rating member of struct clocksource [3].
The following intervals are possible:
- [1-99] These are very bad clocks that can only be used as a last resort or during boot up when there are no better clocks available.
- [100-199] Clocks that are fit for real use, but are not desirable if something better can be found.
- [200-299] Good usable clocks.
- [300-399] Reasonably fast and accurate clocks.
- [400-499] Perfect clocks! A must-use where available.
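To make the role of the rating concrete, here is a minimal userspace sketch of "pick the highest rated clock". It is not kernel code: struct demo_clocksource and demo_select_best() are hypothetical names invented for this illustration, and the real selection logic in the kernel takes more than the rating into account.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct clocksource. */
struct demo_clocksource {
	const char *name;
	int rating;
};

/* Return the entry with the highest rating, or NULL if the list is empty. */
static const struct demo_clocksource *
demo_select_best(const struct demo_clocksource *list, size_t n)
{
	const struct demo_clocksource *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		/* On a tie, the entry that was seen first is kept. */
		if (!best || list[i].rating > best->rating)
			best = &list[i];
	}

	return best;
}
```

With such a rule, a clock that overstates its rating will always win, which is exactly why the ratings are expected to be honest.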
Besides the clocksource, this driver also registers itself as a sched_clock. This clock returns the number of nanoseconds since the system was started. If no driver registers a sched_clock, the system jiffy counter is used for sched_clock() instead.
Background
I'm working on a project where we have an ARM SoC connected to an FPGA over the AEMIF interface. The FPGA implements a free-running clock that we intend to use as the clocksource for the Linux system. The clock is not perfect in terms of accuracy, but it is good enough for our application. The motivation behind this implementation is that the clock keeps ticking even when the Linux system is in deep sleep, where all internal PLLs are gated.
Unfortunately, I cannot share any more details about the platform nor the implementation in the FPGA, but that is not needed to understand the clocksource driver. Since I can't reveal which client it is, I'll call the driver mfoc-clocksource.
Overview of the driver
The FPGA is connected to the memory bus, so the driver has to request_mem_region() and ioremap() the register region to make it accessible. The call to devm_platform_ioremap_resource() does both in a device-managed manner.
The driver consists of just three functions: mfoc_clocksource_probe(), mfoc_clocksource_read() and mfoc_clocksource_read_cnt().
The driver is quite small, since the only thing it has to do is read the timer registers and report the value back to the timekeeping subsystem.
Implementation
Global data
The driver declares one global data structure:
struct mfoc_clocksource {
	void __iomem *regs;
	struct clocksource cs;
	struct clk *clk;
};

static struct mfoc_clocksource *mfoc;
The structure is global and not per device for two reasons:
- There can only be one instance of this device anyway
- The structure must be available to the mfoc_clocksource_read_cnt() function, which is used as the sched_clock. That callback takes no arguments, so it cannot be handed a per-device pointer.
Initialization
Driver init
Since the FPGA is connected to the memory bus, we will register a platform driver:
static const struct of_device_id mfoc_of_match[] = {
	{ .compatible = "mfoc,mfoc-clocksource", },
	{ }
};

static struct platform_driver mfoc_driver = {
	.driver = {
		.name = "mfoc-clocksource",
		.of_match_table = mfoc_of_match,
	},
};
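Note that the .probe member is not set here. The probe function is instead passed to builtin_platform_driver_probe(), as shown in the full driver below. mfoc_clocksource_probe() will then be called for each matching device (of which there can be only one).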
Device probe
static int __init mfoc_clocksource_probe(struct platform_device *pdev)
{
	struct device *dev = &pdev->dev;
	struct clocksource *cs;
	unsigned long rate;
	int err;

	mfoc = devm_kzalloc(dev, sizeof(*mfoc), GFP_KERNEL);
	if (!mfoc)
		return -ENOMEM;

	/* Request and map the FPGA register region */
	mfoc->regs = devm_platform_ioremap_resource(pdev, 0);
	if (IS_ERR(mfoc->regs))
		return PTR_ERR(mfoc->regs);

	mfoc->clk = devm_clk_get_enabled(dev, "mfoc");
	if (IS_ERR(mfoc->clk))
		return PTR_ERR(mfoc->clk);

	rate = clk_get_rate(mfoc->clk);

	cs = &mfoc->cs;
	cs->name = "mfoc";
	cs->rating = 500;
	cs->flags = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_SUSPEND_NONSTOP;
	cs->mask = CLOCKSOURCE_MASK(32);
	cs->read = mfoc_clocksource_read;

	err = clocksource_register_hz(cs, rate);
	if (err) {
		dev_err(dev, "clocksource registration failed\n");
		return err;
	}

	sched_clock_register(mfoc_clocksource_read_cnt, 32, rate);

	return 0;
}
The mfoc_clocksource_probe() function is responsible for a few things:
- Allocate the driver data structure
- Map the register region of the FPGA
- Get the rate from the provided clock
- Populate the struct clocksource structure
- Register the clocksource with clocksource_register_hz()
- Register the clock as a sched_clock.
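To see why the rate is needed at registration, it helps to look at how a raw counter value becomes nanoseconds. The sketch below is a userspace illustration, not driver code. It mirrors the arithmetic of the kernel's clocksource_hz2mult() and clocksource_cyc2ns() helpers, but the demo_ names and the shift value used in the example are assumptions chosen for readability. clocksource_register_hz() works out suitable mult and shift values on its own.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/*
 * Compute a multiplier so that ns = (cycles * mult) >> shift.
 * Adding hz / 2 before dividing rounds to the nearest integer.
 */
static uint32_t demo_hz2mult(uint32_t hz, uint32_t shift)
{
	uint64_t tmp = DEMO_NSEC_PER_SEC << shift;

	tmp += hz / 2;
	return (uint32_t)(tmp / hz);
}

/* Convert a cycle count to nanoseconds with one multiply and one shift. */
static uint64_t demo_cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

For a 1 MHz clock and an assumed shift of 20, demo_hz2mult() gives 1048576000, and one cycle then converts to exactly 1000 ns. The point of the mult/shift pair is that the conversion needs no division in the hot path.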
This is how the struct clocksource is populated:
	cs = &mfoc->cs;
	cs->name = "mfoc";
	cs->rating = 500;
	cs->flags = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_SUSPEND_NONSTOP;
	cs->mask = CLOCKSOURCE_MASK(32);
	cs->read = mfoc_clocksource_read;
The rating of 500 is not very honest, as it is even above the range reserved for perfect clocks, but it makes sure that this clocksource is selected by the system. CLOCK_SOURCE_IS_CONTINUOUS and CLOCK_SOURCE_SUSPEND_NONSTOP are set as flags to tell the system that this clock is continuous and will not stop during suspend.
Other possible flags are:
#define CLOCK_SOURCE_IS_CONTINUOUS		0x01
#define CLOCK_SOURCE_MUST_VERIFY		0x02

#define CLOCK_SOURCE_WATCHDOG			0x10
#define CLOCK_SOURCE_VALID_FOR_HRES		0x20
#define CLOCK_SOURCE_UNSTABLE			0x40
#define CLOCK_SOURCE_SUSPEND_NONSTOP		0x80
#define CLOCK_SOURCE_RESELECT			0x100
#define CLOCK_SOURCE_VERIFY_PERCPU		0x200
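The mask member deserves a note as well. CLOCKSOURCE_MASK(32) tells the timekeeping core that the counter is 32 bits wide, so that the elapsed cycles still come out right when the counter wraps around. Here is a minimal userspace sketch of that idea. The demo_ helpers are hypothetical names for this illustration and are not the kernel's own functions.

```c
#include <stdint.h>

/* All-ones mask for a counter that is 'bits' wide. */
static uint64_t demo_mask(unsigned int bits)
{
	return bits < 64 ? (1ULL << bits) - 1 : ~0ULL;
}

/* Elapsed cycles between two reads, correct across a single wrap. */
static uint64_t demo_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	return (now - last) & mask;
}

/* Whole seconds until a counter of 'bits' width (bits < 64) wraps at 'hz'. */
static uint64_t demo_wrap_seconds(unsigned int bits, uint64_t hz)
{
	return (1ULL << bits) / hz;
}
```

If the previous read was 0xFFFFFFFE and the current read is 0x5, the masked difference is 7 cycles, as expected. With the 1 MHz rate from the device tree node used in this post, a 32-bit counter wraps after about 4294 seconds, a little under 72 minutes, so the counter has to be read at least that often for the delta to stay unambiguous.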
mfoc_clocksource_read
static u64 notrace mfoc_clocksource_read_cnt(void)
{
	u32 time_in_us;

	/* Latch the current counter value into SYSTIMLO/SYSTIMHI */
	iowrite16(0x0001, mfoc->regs + SYSREADTRIG);
	ndelay(400);

	/* Combine the two 16-bit halves into one 32-bit value */
	time_in_us = ioread16(mfoc->regs + SYSTIMLO);
	time_in_us |= (u32)ioread16(mfoc->regs + SYSTIMHI) << 16;

	return time_in_us;
}

static u64 notrace mfoc_clocksource_read(struct clocksource *cs)
{
	return mfoc_clocksource_read_cnt();
}
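These are the callback functions used by the clocksource and the sched_clock. mfoc_clocksource_read_cnt() writes 0x1 to SYSREADTRIG to lock the counter value, waits 400 ns, and then reads the system time from the two 16-bit registers. mfoc_clocksource_read() is just a wrapper with the signature that struct clocksource expects.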
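Locking the value before reading matters because the 32-bit counter is read as two 16-bit halves. If the counter went from 0x0000FFFF to 0x00010000 between the two reads, an unlatched read could return 0x0001FFFF. The following userspace sketch simulates the latch-then-read sequence. struct demo_regs and the demo_ functions are hypothetical stand-ins for this illustration and say nothing about how the FPGA is actually implemented.

```c
#include <stdint.h>

/* Hypothetical register model: a live counter plus two latched halves. */
struct demo_regs {
	uint32_t live;	/* free-running counter */
	uint16_t lo;	/* latched low half, like SYSTIMLO */
	uint16_t hi;	/* latched high half, like SYSTIMHI */
};

/* Snapshot the live counter, like the write to SYSREADTRIG. */
static void demo_latch(struct demo_regs *r)
{
	r->lo = (uint16_t)(r->live & 0xffff);
	r->hi = (uint16_t)(r->live >> 16);
}

/* Join two 16-bit halves; the cast keeps the shift in unsigned 32 bits. */
static uint32_t demo_combine(uint16_t lo, uint16_t hi)
{
	return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Latch first, then read both halves of the same snapshot. */
static uint32_t demo_read(struct demo_regs *r)
{
	demo_latch(r);
	return demo_combine(r->lo, r->hi);
}
```

Because both halves come from the same snapshot, the live counter may keep running between the two reads without corrupting the result.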
Full driver implementation
// SPDX-License-Identifier: GPL-2.0
/*
 * MFOC System Timer clocksource
 *
 * Copyright (C) Marcus Folkesson <marcus.folkesson@gmail.com>
 */

#include <linux/clk.h>
#include <linux/clocksource.h>
#include <linux/delay.h>
#include <linux/io.h>
#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/sched_clock.h>

#define SYSREADTRIG	(0x00)
#define SYSTIMLO	(0x04)
#define SYSTIMHI	(0x06)

struct mfoc_clocksource {
	void __iomem *regs;
	struct clocksource cs;
	struct clk *clk;
};

static struct mfoc_clocksource *mfoc;

static u64 notrace mfoc_clocksource_read_cnt(void)
{
	u32 time_in_us;

	/* Latch the current counter value into SYSTIMLO/SYSTIMHI */
	iowrite16(0x0001, mfoc->regs + SYSREADTRIG);
	ndelay(400);

	/* Combine the two 16-bit halves into one 32-bit value */
	time_in_us = ioread16(mfoc->regs + SYSTIMLO);
	time_in_us |= (u32)ioread16(mfoc->regs + SYSTIMHI) << 16;

	return time_in_us;
}

static u64 notrace mfoc_clocksource_read(struct clocksource *cs)
{
	return mfoc_clocksource_read_cnt();
}

static int __init mfoc_clocksource_probe(struct platform_device *pdev)
{
	struct device *dev = &pdev->dev;
	struct clocksource *cs;
	unsigned long rate;
	int err;

	mfoc = devm_kzalloc(dev, sizeof(*mfoc), GFP_KERNEL);
	if (!mfoc)
		return -ENOMEM;

	/* Request and map the FPGA register region */
	mfoc->regs = devm_platform_ioremap_resource(pdev, 0);
	if (IS_ERR(mfoc->regs))
		return PTR_ERR(mfoc->regs);

	mfoc->clk = devm_clk_get_enabled(dev, "mfoc");
	if (IS_ERR(mfoc->clk))
		return PTR_ERR(mfoc->clk);

	rate = clk_get_rate(mfoc->clk);

	cs = &mfoc->cs;
	cs->name = "mfoc";
	cs->rating = 500;
	cs->flags = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_SUSPEND_NONSTOP;
	cs->mask = CLOCKSOURCE_MASK(32);
	cs->read = mfoc_clocksource_read;

	err = clocksource_register_hz(cs, rate);
	if (err) {
		dev_err(dev, "clocksource registration failed\n");
		return err;
	}

	sched_clock_register(mfoc_clocksource_read_cnt, 32, rate);

	return 0;
}

static const struct of_device_id mfoc_of_match[] = {
	{ .compatible = "mfoc,mfoc-clocksource", },
	{ }
};

static struct platform_driver mfoc_driver = {
	.driver = {
		.name = "mfoc-clocksource",
		.of_match_table = mfoc_of_match,
	},
};
builtin_platform_driver_probe(mfoc_driver, mfoc_clocksource_probe);
Device tree node
The corresponding devicetree node:
	mfocclock: mfocclock {
		#clock-cells = <0>;
		compatible = "fixed-clock";
		clock-frequency = <1000000>;
	};

	mfoc-clocksource@640000e4 {
		compatible = "mfoc,mfoc-clocksource";
		reg = <0x640000e4 0x8>;
		pinctrl-names = "default";
		clocks = <&mfocclock>;
		clock-names = "mfoc";
		status = "okay";
	};
Summary
We can verify that our clocksource is used by the Linux system by reading the current_clocksource from sysfs:
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
mfoc tim34
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
mfoc
Most clocksource drivers are quite simple. After all, it usually comes down to reading a timer value and reporting it back.