<p><strong>Kernel Recipes 2017 day 3 notes</strong>, by Anisse Astier, 2017-09-29 (Linux Engineer's random thoughts, https://anisse.astier.eu/)</p> <p>This is a continuation of <a href="kernel-recipes-2017-day-1.html">day 1</a> and <a href="kernel-recipes-2017-day-2.html">day 2</a> of Kernel Recipes 2017.</p> <h1>Using Linux <code>perf</code> at Netflix</h1> <p>by Brendan Gregg</p> <p>Brendan started with a ZFS on Linux case study, where it was eating 30% of the CPU resources, which it should never be doing. He started by generating a flame graph with perf, through Netflix's <a href="https://github.com/Netflix/vector">Vector</a> dashboard tool. It was confirmed instantly, despite the initial hunch. This was then quickly thought to be the container teardown cleanup using lots of resources. The only issue here is that this particular project never used ZFS. It was in fact the free code path trying to get real entropy to free empty lists. It was later fixed in ZFS.</p> <p>A particular point underlined is that when profiling, you want to see everything, from the kernel to userspace C or Java code. perf allows doing that, because it has no blind spots, is accurate and has low overhead.</p> <p>This is useful at Netflix, because they scale the number of instances based on the percentage of CPU usage. At Netflix scale, a small performance improvement might lead to a scale-down saving the company a lot of money. While perf can do many things, Netflix uses it to profile CPU usage 95% of the time.</p> <h2>perf basics</h2> <p>perf originated from the implementation of CPU Performance Monitoring Counters (PMCs) in Linux, and supports many features.</p> <p>The main workflow is to do a <code>perf list</code> to look at the available tracepoint events, then <code>perf stat</code> to count particular events. 
<code>perf record</code> allows capturing and dumping the events to the file system; <code>perf report</code> or <code>perf script</code> is then used to analyze the dumped perf data. <code>perf top</code> can be used to look at events in real-time.</p> <p>Brendan maintains a list of <a href="http://www.brendangregg.com/perf.html#OneLiners"><code>perf</code> one-liners</a>, useful to explore and learn about perf capabilities.</p> <p>Brendan came up with <a href="https://github.com/brendangregg/FlameGraph">Flame Graphs</a> when he was profiling a MySQL issue. It's a perl script that converts input data to svg. To use it with <code>perf</code>, use <code>stackcollapse-perf.pl</code> with <code>perf script</code>, and feed the output into <code>flamegraph.pl</code>.</p> <h2>Gotchas</h2> <p>An important thing is to have working stack traces and symbol resolution. To fix stack traces you should either use frame-pointer based stack walking, libunwind or DWARF. You probably want <code>-fno-omit-frame-pointer</code> in your gcc options for C code. For Java, you might want to use <code>perf-map-agent</code> to do symbol resolution and de-inlining.</p> <p>When you go down to instruction level, the problem is that resolution isn't really precise, so you don't really know which instruction you're executing. This is because of modern out-of-order CPU architecture. Intel's PEBS helps with this issue.</p> <p>When using VMs, you might want to have your hypervisor (Xen, etc.) enable PMCs for your OS and handle this properly. 
For containers, <code>perf</code> might have issues finding the symbol files, since they are in a different namespace; this is fixed in 4.14.</p> <p>In conclusion, there's a lot to say about perf, and this talk only scratched the surface of what's possible; Brendan pointed us to the many resources available about it online.</p> <h1>The Serial Device Bus</h1> <p>by Johan Hovold</p> <p>While serial buses are ubiquitous, the TTY layer failed at modeling the resources associated with a serial line.</p> <p>The TTY layer exposes a character device to userspace. It supports line disciplines for switching modes, handling errors, etc.</p> <p>It's possible to write drivers on top in userspace, and Johan used gpsd as an example of this. But you need to know the associated port in advance, and the resources aren't necessarily accessible. And you lose the ability to interact with other subsystems in the kernel. Another example of this is bluetooth, where you register further devices (hci0) in order to be able to control the line-discipline and properly initialize ports.</p> <p>To initialize bluetooth, you use <code>hciattach</code> to configure a tty as a bluetooth device, then the hci device appears, and then you use <code>hciconfig</code> to manage this device. The problem with this type of ldisc driver is that you lose control over some information to userspace, and you don't have the full picture for GPIOs, and other resources for handling power management for example.</p> <h2>Serial Device Bus</h2> <p><code>serdev</code> was originally written by Rob Herring; it was created as a bus for UART-attached devices. It was merged in 4.11, but enabled in 4.12 following some issues.</p> <p>The new bus name is "serial"; it refers to <code>serdev</code> controllers and clients (or slaves). The only controller available is the TTY-port controller. 
The hardware description happens in the Device Tree.</p> <p>serdev allows a new architecture, with simpler interaction and layering, without the need to have userspace change the mode of a TTY first, since all the necessary data is in the Device Tree. For bluetooth, this would mean hci0 would appear at dt probe time, making it possible to use <code>hciconfig</code> directly.</p> <p>There are currently three bluetooth drivers using this infrastructure in the kernel, as well as one ethernet driver (qca_uart).</p> <p>The main limitation is that it's serial-core only. It currently only supports Device Tree, but ACPI support is being worked on. Hotplug support isn't solved either. Patches adding multiplexing to support multiple slaves have been posted.</p> <h1>eBPF and XDP seen from the eyes of a meerkat</h1> <p>by Éric Leblond</p> <p>Suricata is an open-source Intrusion Detection System that relies on kernel features. It starts with dumping all packets at the IP level with linux raw sockets, then does stream reconstruction and application protocol analysis. It works at 10GB/s in normal use in enterprise networks. It analyses the data, and outputs JSON, or even a web dashboard.</p> <p>Suricata uses linux raw sockets with <code>AF_PACKET</code> in memory-mapped fan-out mode for multi-threaded processing.</p> <p>One issue Suricata encountered was the asymmetrical hash being changed in Linux 4.2, breaking ordering so that Suricata couldn't properly analyze the streams. This was fixed later in 4.6.</p> <h2>eBPF</h2> <p>eBPF came to the rescue by enabling Suricata to customize the hash function, and then properly tag packets so that they go to the proper thread (load-balanced), hence preserving ordering.</p> <p>Another issue related to load-balancing is the handling of big flows, which is hard to do without losing packets or ordering. 
One solution is to discard select packets, by bypassing certain packets as soon as possible in the kernel to reduce performance impact.</p> <p>Suricata implemented a new "stream depth" bypass that allows discarding packets once the flow has started, while still capturing the most interesting part at the beginning.</p> <p>For the kernel part of this bypass implementation, nftables did not work because it was too late in the process, after <code>AF_PACKET</code> handling. An eBPF filter using maps helped Suricata achieve this.</p> <p><code>bcc</code> didn't match Suricata requirements, so they used <code>libbpf</code> which is hosted inside the kernel in <code>tools/lib/bpf</code>. Éric says it's easy enough to use.</p> <h2>XDP</h2> <p>The eXpress Data Path (XDP) project was started to give access to raw packet data from the network card, before it reaches the Linux network subsystem and an skb is created. You can even interact with it using an eBPF filter. This needs modified drivers, and many are already supported; in 4.12 there's even a generic driver usable for development, but less performant.</p> <p>Éric started integrating XDP in Suricata, and found that it meant doing more parsing since it was raw packets. <code>libbpf</code> support isn't done yet either. To hand over the capture to userspace, the strategy is to use the perf event system, with its memory mapped ring buffer.</p> <p>This is still a bit fresh, Éric says, but promising and very efficient.</p> <h1>HDMI CEC Status Report</h1> <p>by Hans Verkuil</p> <p>The Voyager space probe sent in 1977 communicates at 1477 bits per second, and CEC is a bus that communicates at 400 bits per second, making Hans the maintainer of the slowest bus in the Kernel.</p> <p>CEC is an optional part of HDMI that provides high level functions and communications for Audio and Video products. It's a 1-line protocol. It has physical addresses, the TV always being 0, and inputs have others. Then there are logical addresses from 0 to 15. 
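</p> <p>To make the addressing scheme concrete, here is a minimal Python sketch. It assumes the usual HDMI convention, not spelled out in the talk notes, that a physical address is a 16-bit value written as four dotted hexadecimal nibbles, so that the TV's address 0 reads <code>0.0.0.0</code>. The helper names are hypothetical and not part of the kernel CEC framework or of <code>cec-ctl</code>.</p>

```python
def format_physical_address(addr):
    """Format a 16-bit CEC physical address as four dotted hex nibbles."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("a CEC physical address must fit in 16 bits")
    nibbles = [(addr >> shift) & 0xF for shift in (12, 8, 4, 0)]
    return ".".join(format(n, "x") for n in nibbles)


def parse_physical_address(text):
    """Parse a dotted address such as '1.0.0.0' back into a 16-bit integer."""
    parts = [int(p, 16) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 0xF for p in parts):
        raise ValueError("expected four nibbles, e.g. '1.0.0.0'")
    addr = 0
    for p in parts:
        addr = (addr << 4) | p
    return addr


def is_valid_logical_address(la):
    """Logical addresses range from 0 to 15, as stated in the talk."""
    return 0 <= la <= 15


if __name__ == "__main__":
    print(format_physical_address(0x0000))  # the TV
    print(format_physical_address(0x1000))  # a device on one of the TV's inputs
```

<p>In this sketch, each nibble describes one hop away from the TV, which is why the TV itself is all zeroes.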
</p> <h2>Features</h2> <p>CEC allows waking up or shutting down a device (TV or other), switching sources, and getting remote control passthrough. You can also tell other devices the name of your device. You can also configure the Audio Return Channel (ARC) to send the audio from the sink (TV) to a device through the HDMI Ethernet pins.</p> <p>Inside the kernel, the CEC framework implements most of the features. The drivers only need to implement the low-level CEC adapter operations. It handles core messages automatically, but you can also get them if you enable passthrough. If you need to assemble or decode CEC messages, there's a BSD and GPL-licensed header-only implementation in <code>cec-funcs.h</code> that can be used by applications. The framework driver API is pretty compact and simple to implement.</p> <p>The userspace API has various messages to set a physical or logical address, set the mode of the fd, etc. </p> <p>The Hotplug Detect use case is complex, since it depends on the status of the HDMI Hotplug Detect Pin (HPD). If the pin is down, some devices won't be able to send CEC messages. Some TVs turn off HPD, but still receive CEC messages. Hans says that the most reliable way to wake up a TV is to just send a message, regardless of the HPD status. It's out-of-spec, but is the only way to make it work.</p> <p><code>cec-ctl</code> is the tool that implements the userspace API and allows interacting with the framework from the command line.</p> <p>In kernel 4.14, many devices are now supported, including the Raspberry Pi. It can now be emulated with the <code>vivid</code> driver. It passed CEC 1.4 and 2.0 compliance tests. This makes Linux the only OS with built-in CEC support, Hans says.</p> <p>In the pipeline is support for many new devices, as well as a brand new <strong>cec-gpio</strong> driver allowing bit-banging of CEC over a GPIO. 
It also allows injecting errors, but this should come later.</p> <h1>20 years of Linux Virtual Memory</h1> <p>by Andrea Arcangeli</p> <p>Virtual Memory (VM) is practically unlimited and costs virtually nothing; virtual pages point to physical pages, which are the real memory.</p> <p>In x86, the pagetable format is a radix tree. With the traditional 4 levels of page tables you can address 256TiB of memory; with 5-level page tables, you can address 128PiB of memory, but it has a performance impact.</p> <p>The VM algorithms in Linux use heuristics to solve a hard problem of using the memory as best as possible. One such choice is to have overcommit by default. Or to use all free memory as cache.</p> <p>In the VM, the basic structure is struct page. It's currently 64 bytes, and is using 1.56% of all memory in a given system.</p> <p>MM is the memory of a process, and is shared by threads. <code>virtual_memory_area</code> VMA is inside the MM. The LRU cache is a combination of two lists of recently used pages, and uses an active and inactive optimum balancing algorithm. The status of those lists is visible in <code>/proc/meminfo</code>.</p> <p>Reverse mapping of the objects (objrmap) is used as well to find reverse references of pages to processes.</p> <p>There are other LRUs for anonymous and file-based mappings, or cgroups.</p> <h2>Trends</h2> <p>Automatic NUMA Balancing helps running various workloads, without having to adapt them to NUMA mode with hard bindings.</p> <p>Transparent Hugepages are a way to automatically use huge pages if an application uses lots of memory, instead of manually with hugetlbfs.</p> <p>The MMU notifier allows reducing page pinning, making it possible to swap-out DMAed memory with proper driver interactions.</p> <p>HMM or Unified Virtual Memory allows going even further for GPU and seamless computing, without requiring cache-coherency.</p> <p>Andrea showed auto-NUMA balancing benchmarks, and it improves transactions as much as 10%. 
A remark from the audience showed that in some pathological cases, the performance might actually be worse, but the feature can be disabled.</p> <h2>Huge Pages</h2> <p>With hugepages, you can go from 4KiB pages to 2MiB pages. This allows completely removing a pagetable level, and thus improving performance in some cases. But it has a cost when clearing pages, making it less cache friendly. In the last case, a huge improvement in performance was seen when clearing the faulting sub-page last, so that it's still in the cache.</p> <p>Transparent Hugepage (THP) works by simply sending 2M pages when the mmap region is 2M aligned, and the request is big enough. It is tunable in <code>/sys/kernel/mm/transparent_hugepage</code>; it can be disabled, enabled only for madvise, or always. The THP defragmentation/compaction is also tunable.</p> <p>Since Linux 4.8, it's possible to use THP with tmpfs and shmem. This is also tunable and disabled by default.</p> <h2>KSM and userfaultfd</h2> <p>Virtual memory deduplication (KSM) is practically unlimited, affecting migration during compaction for example; with KSMscale, a maximum limit is set on per-physical pages dedup, the default is 256, so that a given KSM would only be referenced by 256 virtual pages; this is tunable. Answering a question from the audience, Andrea said that if you care about cross-VM sidechannel attacks, you should probably disable KSM after disabling HyperThreading.</p> <p>userfaultfd allows userspace more visibility and control over page-faulting. It enables postcopy live migration with VMs (efficient snapshotting). 
It can be used to drop write bits for JITs, and has many other uses.</p> <p>Andrea concluded that, after 20 years of working on Linux memory management, he is amazed by the room that remains for innovation and further improvements.</p> <h1>An introduction to the Linux DRM subsystem</h1> <p>by Maxime Ripard</p> <p><a href="http://free-electrons.com/pub/conferences/2017/kr/ripard-drm/ripard-drm.pdf">Presentation slides</a></p> <p>In the beginning, there was the framebuffer. That's how fbdev was born, to do very basic graphics handling. Then, GPUs came along, getting bigger and bigger. In parallel in the embedded space, piles of hacks were accumulated in display engines to accelerate some operations.</p> <p>At first, <em>DRM</em> was only for GPUs' needs, without any kind of modesetting. Modesetting required mapping device registers to userspace, which would then do it itself. But since Kernel Mode-Setting (<em>KMS</em>), this has moved back into the kernel.</p> <p>fbdev is now obsolete, and dozens of ARM drm drivers have been merged since 2011.</p> <p>Traditionally in embedded devices, there were two completely different devices for the GPU and the display engine. In Linux, there's the divide between DRM and KMS.</p> <p>KMS has <em>planes</em>, that can be used for double-buffering. It also has the <em>CRTC</em>, that does the composition. <em>Encoders</em> take the raw data from the CRTC, and convert it to a useful hardware bus format (HDMI, VGA). <em>Connectors</em> output the data, handle hotplug events and EDIDs.</p> <p>In the DRM stack, <em>GEM</em> can be used to allocate and share buffers without copy with the kernel. <em>PRIME</em> can interact with <em>GEM</em> and dma-buf to also handle buffers shared with hardware.</p> <p>Vendors also have their own solutions, like ARM's Mali proprietary driver. 
Blob access for userspace is tightly controlled.</p> <h1>Build farm again</h1> <p>by Willy Tarreau</p> <p>This is a followup of <a href="https://anisse.astier.eu/kernel-recipes-2016-notes.html">last year's</a> <a href="https://lwn.net/Articles/702375/">presentation</a>. The old build farm had shortcomings: it wasn't reliable (HDMI sticks), had a bad power supply, and heating issues. Yet the RK3288 was quite powerful, so Willy wanted to try again with the same CPU.</p> <p>He got 10 MiQi boards, which are even faster thanks to dual-channel DDR3, although still having shortcomings when combining them with foam. Willy fixed the heatsink, by using a 3M thermal tape. Instead of microUSB, Willy simply soldered thicker cables directly on the board. And to solve the switch attrition issue, he tried a Clearfog-A1 board.</p> <p>distcc was updated to the latest version for more flexibility, and bumped settings in order to saturate all the cores on all CPUs. LZO compression helped reducing upload time. He also found that there was a hardcoded limit of 50 parallel jobs in distcc, and fixed it. </p> <p>He improved the distcc distribution using haproxy in front with the leastconn algorithm, this helped a lot.</p> <p>Using the cluster in addition to his local beefy machine, he went from 13 minutes for kernel builds to 4m45s.</p> <p>To help with monitoring, Willy submitted a new <code>led-activity</code> LED trigger for the kernel to change the blinking speed depending on CPU usage.</p> <p>To build haproxy, he went from 11s to 3s with the added farm. With up to 200 builds a day, it saves less than half an hour per day.</p> <p>Feedback was sent to MiQi's maker; patches to distcc. The quest for a good USB power supply continues. Willy is now exploring alternative boards for even faster builds.</p> <p><em>(That's it for Kernel Recipes 2017! 
See you next year!)</em></p> <p><strong>Kernel Recipes 2017 day 2 notes</strong>, by Anisse Astier, 2017-09-28</p> <p>This is a continuation of <a href="kernel-recipes-2017-day-1.html">yesterday's live blog</a> of Kernel Recipes 2017.</p> <h1>Linux Kernel Self Protection Project</h1> <p>by Kees Cook</p> <p><a href="http://outflux.net/slides/2017/kr/kspp.pdf">Presentation slides</a></p> <p>The aim of the project is more than protecting the kernel.</p> <h2>Background</h2> <p>Kees' motivation for working on Linux is the two billion Android devices running Linux. The majority of those are running a 3.4 kernel.</p> <p>CVE lifetimes (the time between bug introduction and fix) are pretty long, averaging many years.</p> <p>Kees says the kernel team is fighting bugs, they are finding them, but just doing that isn't enough. The analogy Kees gave was that Linux security is in the same place the car industry was in the 60s, where most of the work went into making sure cars worked, but not necessarily that they were safe.</p> <p>Killing bug classes is better than simply fixing bugs. There's some truth in the upstream philosophy that all bugs might be security bugs. Shutting down exploitation targets and methods is more valuable in the long term, even if it has a development cost.</p> <p>Modern exploit chains are built on a series of bugs, and just breaking the chain at one point is enough to stop or delay exploitation.</p> <p>There are many out-of-tree defenses that have existed over the years: PaX/grsec, or many articles presenting novel methods that were never merged upstream. Being out-of-tree is not anything special, since the development mode in Linux is to fork. Distros integrate custom mitigations, like RedHat's ExecShield, Ubuntu's AppArmor, grsecurity or Samsung's Knox for Android.</p> <p>But in the end, upstreaming is the way to go, Kees says. 
It protects more people, reduces maintenance cost, allowing to focus on new work instead of playing catch-up.</p> <p>Many defenses are powerful because they're not the default, and aren't widely examined. Kees gave an example of custom email server configurations that were very effective to fight spam <em>because</em> they're not the default, otherwise the spammers would adapt.</p> <p>Kees then showed another example with grsecurity, where the stack clash protection was not upstreamed, not reviewed, and was in the end weaker than the solution finally merged upstream.</p> <h2>Kernel self protection project</h2> <p>In 2015, Kees announced this project because he realized he wouldn't be able to do all the upstreaming work by himself. It is now an industry-wide project, with many contributors.</p> <p>There are various types of protections: probabilistic protections reduce the probability of success of an exploit. Deterministic protections completely block an exploitation mechanism.</p> <p>Stack overflow and exhaustion is an example of a bug class that was closed down upstream with vmap stack. Kees is still porting a pax and grsecurity gcc plugin to work on that. The stack canary is essential as well, Kees said. For instance, it mitigates the latest BlueBorne vulnerability.</p> <p>Integer over/underflow protection went inside the kernel with the new refcount patches. Buffer overflows are mitigated upstream through Hardened user copy or recent FORTIFY_SOURCE integration. Format string injection was mitigated in 3.13 when the %n format option was completely removed.</p> <p>Kernel pointer leak isn't entirely plugged, despite various fixes. Uninitialized variable was mitigated through porting of the structleak PaX gcc plugin. 
Kees says it's more than an infoleak, and this might be exploited in some cases.</p> <p>Use-after-free was mitigated with page zero poisoning in Linux 4.6, and freelist randomization in 4.7 and 4.8.</p> <h2>Exploitation</h2> <p>The first step is to find the kernel in memory (e.g. through kernel pointer leaks). To mitigate this, there are various types of kASLR or the ported grsecurity randstruct plugin.</p> <p>A very basic protection is to make sure executable memory cannot be writable, and this was merged for various architectures a long time ago.</p> <p>Function pointer overwrite is a very standard exploitation method, and this was mitigated by the pax constify plugin, and then the ro_after_init annotation in the kernel.</p> <p>Mitigating userspace execution is still a work in progress on x86, but arm64 already has fixes for that.</p> <p>The next stages are mitigating user data reuse, and reused code chunks (ROP); PaX has a closed-source technology, RAP, to do this.</p> <h1>Understanding the Linux Kernel via ftrace</h1> <p>by Steven Rostedt</p> <p>Steven started by saying that this talk is really fast, and you should watch it three times to understand it.</p> <p>Ftrace is an infrastructure with several features. Ftrace is the exact opposite of security hardening: it gives visibility in the kernel, provides instrumentation to do live-kernel patching, and of course rootkits.</p> <p>Ftrace is already in the kernel. It was initially accessed through <code>debugfs</code>, but it now has its own fs, <code>tracefs</code>, mountable in <code>/sys/kernel/tracing</code>. All files and even documentation are in there, so it's usable through echo and cat because Steve wanted busybox to be enough to control these features. This is where the files described in the rest of the talk live.</p> <p>The basic file is <code>trace</code>, showing the raw data. Then there's <code>available_tracers</code>. The default tracer is the <code>nop</code> one, which does nothing. 
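</p> <p>Since everything is plain files, the same steps can be scripted. Here is a minimal Python sketch that lists the tracers and selects one. It assumes tracefs is mounted at <code>/sys/kernel/tracing</code> and that the script runs as root; <code>current_tracer</code> is the standard tracefs file for selecting a tracer, although it isn't named in these notes, and the helper names are hypothetical.</p>

```python
from pathlib import Path

# Mount point given in the talk; adjust if tracefs lives elsewhere.
TRACEFS = Path("/sys/kernel/tracing")


def available_tracers(tracefs=TRACEFS):
    """Return the tracer names listed in the available_tracers file."""
    return (tracefs / "available_tracers").read_text().split()


def set_tracer(name, tracefs=TRACEFS):
    """Select a tracer, the equivalent of `echo name > current_tracer`."""
    if name not in available_tracers(tracefs):
        raise ValueError("tracer %r is not available" % name)
    (tracefs / "current_tracer").write_text(name)


def read_trace(tracefs=TRACEFS):
    """Return the content of the trace file, like `cat trace`."""
    return (tracefs / "trace").read_text()


if __name__ == "__main__":
    print(available_tracers())
    set_tracer("nop")  # the default tracer, which does nothing
    print(read_trace())
```

<p>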
The most interesting one is the <code>function</code> tracer, that shows every called function in the kernel. The most beautiful one, according to Steve, is the <code>function_graph</code> tracer that follows the call graph.</p> <p>The <code>tracing_on</code> file controls writes to the ring buffer. When it is off, the tracing infrastructure is still in place, but the ring buffer isn't filled with data. It's there for temporary pauses of tracing.</p> <p>There are a few files that allow filtering the ftrace output: <code>set_ftrace_filter</code> for example matches the function names, and supports glob matching, appending, or clearing.</p> <p>The file <code>available_filter_functions</code> shows the available functions; it does not include all kernel functions, depending on gcc instrumentation (inline functions, and annotated non-traceable functions such as timers, ftrace itself, or boot time code).</p> <p>When using the <code>function</code> tracer, it shows the function calls as well as the parent.</p> <p>The filter file <code>set_ftrace_pid</code> limits tracing to functions executed by a given task. If you have multiple threads, it's the thread id.</p> <p>To trace syscalls, you need to know that the definition macros add a <code>sys_</code> prefix to the syscall names. If you want to trace the read syscall, you should trace the <code>SyS_read</code> function, because the upper case function comes first. You can find it in the <code>available_filter_functions</code> file.</p> <p>The <code>set_graph_function</code> filter helps when you want to trace starting from a given point, and follow the call graph, across function pointer boundaries, giving you insight that's harder to get with just the code. Steven gave an example with the <code>sys_read</code> syscall, where you can know exactly which function is called, even when you have the file_operations structure making code reading harder, but the graph is very clear. 
You can combine this with <code>set_ftrace_notrace</code> to set a boundary of functions or <code>set_graph_notrace</code> for call graphs you're not interested in, to ease reading the call graph and reduce the ftrace performance impact.</p> <p>There are many options in the <code>options</code> directory or the <code>trace_options</code> file. Steven likes the <code>func_stack_trace</code> option: it creates a stack trace of traced functions. Be careful, if you don't set a filter, it's going to bring your machine to its knees. Also remember to turn it off when done. <code>sym_offset</code> or <code>sym_addr</code> options show the function relative and absolute locations in memory.</p> <p>When you set a filter starting with <code>:mod:module_name</code>, it will trace all the functions in a given module.</p> <p>Function triggers are useful when you want to start tracing, stop tracing, or even add a stacktrace when a function is hit. For example, you can set a filter with <code>function_name:stacktrace</code>, and it will give you a stacktrace every time this particular function is called.</p> <p>When interrupted, you might not want to see the interrupt function graph: there's a default-on option <code>funcgraph-irqs</code> that does just that if you turn it off.</p> <p>It's possible to limit the graph depth of the <code>function_graph</code> tracers with the <code>max_graph_depth</code> option.</p> <p>You can also trace with events. The events are listed by subsystems in the <code>events</code> directory. The most commonly used ones are <code>sched</code>, <code>irq</code> or <code>timer</code> families of events. You enable events separately from the specific tracers. 
If you only want events, use the <code>nop</code> tracer, but this can be combined with the others.</p> <p>There are two useful options to control event and function tracing: <code>event-fork</code> and <code>function-fork</code> allow to continue tracing children of a traced process.</p> <p>Finally, Steve introduced the <code>trace-cmd</code> program, that wraps all the custom <code>echo</code>s and <code>cat</code>s in a single program. <code>trace-cmd</code> has nice tricks to make sure you only stack-trace a single function, and can do all you can do without it with a simpler interface.</p> <h1>Introduction to Generic PM domains</h1> <p>by Kevin Hilman</p> <p>Two years ago, Kevin did an introduction on various power management subsystems at Kernel Recipes. This talk focuses on PM domains.</p> <p>The driver model starts with the <code>struct dev_pm_ops</code>. You control the global system suspend through <code>/sys/power/state</code>, and this then calls the appropriate driver callbacks. It's very powerful, but also fragile since any driver failing will stop the whole chain. This is static power management or system-wide suspend.</p> <p>The focus of this talk is the Dynamic power management, in particular for devices.</p> <h2>Dynamic power management</h2> <p>It starts with runtime PM, a per-device idle mode, one device at a time. It's handled by the driver, based on activity. In this mode, devices are independent, and one device cannot affect other drivers. When using <code>powertop</code>, the "device stats" tell you how long your device is idle.</p> <p>The runtime PM core keeps a usage count for driver uses. When the count hits 0, the core calls runtime_suspend on a device. If you have a device on a bus_type, it sits between you and the runtime PM core. 
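</p> <p>The usage-count rule above can be pictured with a toy model. The following Python sketch is purely illustrative and is not kernel code: the class and method names are hypothetical, and only loosely mirror the get/put style of the runtime PM API. It shows a device resuming on first use and suspending when its count drops back to 0.</p>

```python
class ToyRuntimePMDevice:
    """Toy model of a device managed by a usage count (not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.usage_count = 0
        self.suspended = True
        self.log = []  # records the callbacks that would have been called

    def get(self):
        """A user starts using the device: resume it if needed."""
        if self.suspended:
            self.log.append("runtime_resume")
            self.suspended = False
        self.usage_count += 1

    def put(self):
        """A user is done: suspend the device when the count hits 0."""
        if self.usage_count == 0:
            raise RuntimeError("unbalanced put on " + self.name)
        self.usage_count -= 1
        if self.usage_count == 0:
            self.log.append("runtime_suspend")
            self.suspended = True


if __name__ == "__main__":
    dev = ToyRuntimePMDevice("uart0")
    dev.get()
    dev.get()
    dev.put()  # still one user left: the device stays active
    dev.put()  # count hits 0: the device is suspended
    print(dev.log)
```

<p>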
In driver callbacks, one can ensure context is saved, and the wakeups are enabled, restore context on resume, etc.</p> <p>PM domains map the architecture of power domains inside modern SoCs, where various hardware blocks are grouped in domains that can be turned on and off independently, to the Linux kernel.</p> <p>PM domains are similar to bus types in the kernel, but orthogonal since some devices might be in the same domain but different buses.</p> <h2>genpd</h2> <p>Generic PM domains (genpd) are the reference implementation of PM domains, to be able to do the grouping and actions when a device becomes idle or active.</p> <p>In order to implement a genpd, you first implement the power_on/power_off function. It's typically messaging a power domain controller on a separate core, but might be related to clock management or voltage regulators. This is then described in a Device Tree node, allowing to reorder domains for different chip revisions.</p> <p>Power domains have a notion of governors, allowing custom decision making before cutting power. It allows flexibility relative to the ramp up/down delays for example. It is usually implemented in the genpd, but there are two built-in governors like Always-on or Simple QoS governors. You can attach runtime system-wide or per-device QoS constraints to control the governors.</p> <p>There has been a lot of work recently upstream, like IRQ-safe domains, or always-on domains. Statistics and debug instrumentations were also added recently.</p> <p>Under discussion is a way to unify CPU and devices power domain management. Upstream is also interested in having a better interaction between static and runtime PM. 
Support for more complex domains, in order to have the same driver for an IP block whether it's used through ACPI or genpds, is still in the works.</p> <h1>Performance Analysis Superpowers with Linux BPF</h1> <p>by Brendan Gregg</p> <p><a href="https://www.slideshare.net/brendangregg/kernel-recipes-2017-performance-analysis-with-bpf">Presentation slides</a></p> <p>Boldly starting the presentation with a demo, Brendan showed how to analyze how top works, with <code>funccount</code> and <code>funcslower</code>, <code>kprobe</code>, <code>funcgraph</code> and other ftrace-based tools he wrote.</p> <p>He then switched to an eBPF frontend called <code>trace</code>, that was used to dig into the arguments of a kernel function. You can leverage eBPF even more with other tools like <code>execsnoop</code> or <code>ext4dist</code>.</p> <h2>eBPF and bcc</h2> <p>BPF comes from network filtering, originally used with tcpdump. It's a virtual machine in the kernel.</p> <p>BPF sources can be tracepoints, kprobes, or uprobes. It uses the perf event ring buffer for efficiency. You can use maps as an associative array inside the kernel. The general tracing philosophy is to have a very precise filter to only get the data you need, instead of dumping all the data in userspace, and filtering it later.</p> <p>Many features were added recently to eBPF, and it keeps being improved.</p> <p>BPF Compiler Collection (BCC) is the most used BPF frontend. It allows you to write BPF programs in C instead of assembly, and load the programs. You can then combine this with a python userspace.</p> <p><a href="https://github.com/ajor/bpftrace"><code>bpftrace</code></a> is a new in-development frontend, with a simple-to-use philosophy.</p> <p>Installing bcc on your distro is becoming easier as it gets packaged. 
There are <a href="https://github.com/iovisor/bcc#tools">many tools</a>, each with a different use, giving visibility into a different part of the kernel.</p> <p>Heatmaps are very useful to visualize event distribution. Flamegraphs are also very powerful when combined with kernel stacktrace generation. It's now even possible to merge userspace and kernelspace stacktraces for analysis.</p> <h2>Future work</h2> <p>Support for higher level languages to write BPF programs like <code>ply</code> or <code>bpftrace</code> is in progress.</p> <p>In conclusion, eBPF is very useful to understand Linux internals, and you should use it.</p> <h1>Kernel ABI Specification</h1> <p>by Sasha Levin</p> <p>What's an ABI ? ioctls, syscalls, and the vDSO are examples of the Linux ABI.</p> <p>Sasha repeated the ABI promise from Greg's talk yesterday. The issue, he says, is that the kernel lacks tools to detect a broken ABI.</p> <p>Sometimes basic syscall argument checks are forgotten, and later discovered as a security vulnerability. Sometimes, some interfaces have undefined behaviour, making the ABI stability uncertain.</p> <p>Breakage is sometimes difficult to fix when detected late, because new userspace might depend on the new behaviour.</p> <p>In the end, some userspace programs like glibc, strace, or syzkaller might rewrite their understanding of the kernel ABI, and those might be out of sync. Man pages might not document everything either, and they're not a real documentation of the ABI Contract.</p> <h2>ABI Contract</h2> <p>Right now it's in the form of kernel code. Unfortunately, code evolves, so it's not an optimal format for this.</p> <p>The goal is to fix many issues at the same time: ensure backwards compatibility, prevent kernel to userspace errors, document the contract, and encourage re-use. Sasha looked for a format that would only require writing this once, and be machine readable. <code>syzkaller</code>'s description looked like a good starting point. 
He wanted this to be reusable by userspace tools that need this information. And finally, he wanted to use this as a tool to help ABI fixes and fast breakage detection.</p> <p>It also helps reassure distributions that the ABI promise is really kept. In Sasha's view, it would also greatly help the security aspect of things, since the ABI is the main interface by which the kernel is attacked.</p> <p>The hard part is to determine the format of this contract, document all syscalls and ioctls and write the tools to test it out.</p> <p>Sasha already started with a few system calls, and is currently looking for help to get the ball rolling.</p> <h1>Lightning Wireguard talk</h1> <p>by Jason A. Donenfeld</p> <p>Jason's background is in breaking VPNs. He wanted to create one that was more secure. That's how Wireguard was born.</p> <p>Wireguard is UDP based, and uses modern cryptographic principles. The goal is to make it simple and auditable. To prove his point, he showed that it clocks at 3900 lines of code, while OpenVPN, Strongswan or SoftEther have between 116730 and 405894 lines of code each.</p> <p>It uses normal interfaces, added through the standard <code>ip</code> tool. Jason says it's blasphemous because it breaks through the layering assumptions barriers, as opposed to IPsec for example.</p> <p>A given interface has a 1 to N mapping between Public keys and IP addresses representing the peers. To configure the cryptokey routing, you use the <code>wg</code> tool for now. Once Wireguard is merged, the intention is to have this merged into the <code>iproute</code> project.</p> <p>In Wireguard, the interface appears stateless, while under the hood, session state and connections are handled transparently.</p> <p>The key distribution between peers is left to userspace.</p> <p>Wireguard works well with network namespaces. You can for example limit a container to only communicate through a wireguard interface.</p> <p>As a design principle, wireguard has no parsing. 
It also won't interact at all with unauthenticated packets, making it un-scannable unless you have the proper peer private key.</p> <p>Under the hood, it uses the Noise Protocol Framework (used by WhatsApp) by Trevor Perrin, with modern algorithms like Chacha20, Blake2s, etc. It lacks crypto agility, but supports a transition path.</p> <p>To conclude, Jason says that Wireguard is the fastest, and lowest latency available VPN out there.</p> <h1>Modern Key Management with GPG</h1> <p>by Werner Koch</p> <h2>What's new</h2> <p>GnuPG 2.2 was released a few weeks ago, while 2.1 has been around for nearly 3 years. There's now easy key discovery going through key servers to search keys associated with an email address.</p> <p>You can now use gpg-agent over the network, so that you don't have to upload your private keys to a server.</p> <p>In the pipeline for version 2.3 are SHA2 fingerprinting, an AEAD mode, and new default algorithms. The goal is also to help upper applications to integrate GPG in their projects. Werner says he also wants to make the Gnuk open hardware USB token easier to buy in Europe. Improving documentation is also planned.</p> <p>GPG will be moving to ECC. While this is a well-researched field, some curves (specific ECC implementations) have a pretty bad reputation according to Werner, and some of those are required by NIST, or European standards. The new de-facto standard curves are Curve25519 and Curve448-Goldilocks.</p> <p>An advantage of ECC key signatures is that they are much shorter than RSA signatures, and faster to compute for signing. Verification is slower though.</p> <h2>User experience</h2> <p>The command line interface is being improved with new <code>--quick-</code> options, that are simpler to use. 
There's now a quick command to generate a key, update the expiration time, add subkeys, update your email address (uid), revoke the old address, sign a key, or verify a key locally for key signing parties.</p> <p>The main issue with key servers is that they can't map an address to a key. Anyone can publish a key with a given email. The proper way to handle this is through the email server, but this isn't solved yet. Werner's opinion is that the Web-of-Trust is too complex a tool; he believes that Trust On First Use (TOFU) is a better paradigm.</p> <p>There are two GPG interfaces: one for humans, and one for scripting. You should always use the scripting one with your programs, it's more stable.</p> <p>There are now import/export filters in GPG to reduce the size impact of keys with lots of signatures.</p> <p>You can now <code>ssh-add</code> keys into the <code>gpg-agent</code>. The only caveat is that in this case, GnuPG stores the key forever in its private key directory instead of just in memory.</p> <p>In conclusion, GPG isn't set in stone, and it keeps improving and evolving. The algorithms, user interface, scriptability are getting better.</p> <p><em>(That's it for today ! Continue reading on the <a href="kernel-recipes-2017-day-3">last day</a> !)</em></p>Kernel Recipes 2017 day 1 live-blog2017-09-27T00:00:00+02:002017-09-27T00:00:00+02:00Anisse Astiertag:anisse.astier.eu,2017-09-27:kernel-recipes-2017-day-1.html<p>Following <a href="kernel-recipes-2016-notes.html">last year's attempt</a>, I'm doing a live blog of Kernel Recipes 6th edition. There's also a <a href="https://air.mozilla.org/embedded-recipes-27-sept-morning/">live stream at Air Mozilla</a></p> <h1>What's new in the world of storage for Linux</h1> <p>by Jens Axboe</p> <p>Jens started with the status of blk-mq conversions; most drivers are now converted: stec, nbd, MMC, scsi-mq, ciss. 
There are about 15 drivers left, but Jens says it isn't over until floppy.c is converted, re-offering the prize he offered two years ago.</p> <p>blk-mq scheduling was the only missing feature, in order to tag I/O requests, have better flush handling, or help with scalability. To address this, blk-mq-sched was added in 4.11, with the "none" and "mq-deadline" algorithms. 4.12 saw the addition of BFQ and Kyber algorithms.</p> <p>Writeback throttling is a feature to prevent overwhelming the device with requests, to keep peak performance high. It was inspired by the networking CoDel algorithm. It was tested with io.go, and proven to improve latency tremendously on both NVMe and hard-drives.</p> <p>IO polling helps getting faster completion times, but it has a high CPU cost. Hybrid polling was added, with predictive algorithms in the kernel to be able to wake up the driver just before the IO completes. The kernel tracks IO completion time, and just sleeps for half the mean, allowing both fast completion time, and less CPU load leading to better power management. This is configurable through sysfs, with the proper fd configuration. Results show that adaptive polling is comparable in completion times with active polling, but with half the CPU cost.</p> <p>Faster O_DIRECT and Faster IO accounting were also worked on. IO accounting used to be invisible in profiling, but with the huge scaling efforts of the IO stack, it started showing at 1-2% in testing. In synthetic tests, disabling iostat started improving performance greatly. It was rewritten and merged in 4.14.</p> <p>A new mechanism called Write lifetime hints allows applications to signal expected write lifetime with fcntl. It allows giving hints to flash-based storage (supported in NVMe 1.3) about the total size of the write, making sure you won't get such a big write amplification associated with the internal Flash Translation Layer (FTL) when you do big writes. 
The device might make more intelligent decisions, and do better garbage collection internally. It showed improvements with RocksDB benchmarks.</p> <p>IO throttling was initially tied to CFQ, which isn't ideal with the new blk-mq framework. It now scales better on SSDs, supports cgroup2, and was merged for 4.10.</p> <p>Jens came back to a slide from Kernel Recipes 2015 where he predicted the future work; all the features previously discussed in this talk were completed in the two-year timespan.</p> <p>In the future, IO determinism is going to be the focus of work, as well as continuous performance improvements.</p> <h1>Testing on device with LAVA</h1> <p>by Olivier Crête</p> <p>Continuous integration is as simple as "merge early, merge often", Olivier says. But the core of the value is more in Continuous Testing, and that's what most people think of when they say CI.</p> <p>Upstream kernel code is properly reviewed, so why should it be tested, Olivier asked. Unfortunately, ARM boards aren't easy to test, so the kernel used to rely on users to do the testing.</p> <p>That's until kernelci.org came along, doing thousands of compiles and boots every day, catching a lot of problems. kernelci.org is very good at breadth of testing, but not depth. If you have any serious project, you should do your own testing, with your own hardware and patches.</p> <p>Unfortunately, automation isn't ubiquitous, because the perceived value is low compared to cost. To overcome this, the first thing to have is a standardized, single-click build system, with no manual operation. The build infrastructure should be the same for everyone, and Olivier recommends using Docker images.</p> <p>The second step is to close the CI loop, which is sending automated messages to the developer on failure as soon as possible. 
Public infrastructure like GitLab, GitHub or Phabricator has support for CI, as well as for blocking the merge of anything that breaks the build.</p> <h2>LAVA</h2> <p>Linaro Automation and Validation Architecture (LAVA) is not a CI system. It just focuses on board management, making it easier to test boards. It can install images, do power control, and supports serial, ssh, etc. It's packaged for Debian and has Docker images available. It should be combined with a CI system like Jenkins.</p> <p>The first thing to have is a way to power a board on and off. You can find various power switch relay boards from APC, Energenie, Devantech, or even other USB relays.</p> <p>LAVA supports different bootloaders: u-boot, fastboot, and others. The best strategy is to configure the bootloader for network booting.</p> <p>LAVA is configured with a jinja2 template format, where you set various variables for the commands you need to connect to, reset, and power on/off the board.</p> <p>Tests are defined by YAML files, and can be submitted directly through the API or via command line tools like lava-tool, lqa, etc. You specify the name of the job, timeouts, visibility, priority, and a list of actions to do.</p> <h2>Conclusion</h2> <p>You should do CI, Olivier says. It requires a one-time investment, and saves a lot of time in the end. According to Olivier, from nothing, a LAVA+Jenkins setup is at most two days of work. Adding a new board to an infrastructure is done in one or two hours.</p> <h1>Container FS interfaces</h1> <p>by James Bottomley</p> <p>James started with an introduction on virtualization and hypervisor OSes. Within Linux, there are two hypervisor OSes: Xen and KVM. Both use Qemu to emulate most devices, but they differ in approach. Xen introduced para-virtualization, modifying the OS to enhance emulation. But hardware advancements killed para-virt, except in a few devices. 
In James' opinion, the time lost in working with paravirt in Linux made it lose the enterprise virtualization market to VMware.</p> <p>Container "guests" just run on the same kernel: there is one kernel that sees everything. The disadvantage is that you can't really run Windows on Linux.</p> <p>The container interface is mostly cgroups and namespaces. There are label-based namespaces, the first one being the network namespace. There are mapping namespaces, mapping some resources to somewhere else, allowing those to be seen differently, like the PID namespace, which can map a given PID on the host to be PID 1 inside the container.</p> <p>Containers are used in Mesos, LXC, docker, and they all use the same cgroups and namespaces standard kernel API. There are many sorts of cgroups (block IO, CPU, devices, etc.), but they aren't a focus of the talk. James intends to focus on namespaces instead.</p> <p>James claims that you don't need any of the "user-friendly" systems, and you can just use the clone, unshare, and standard kernel syscall API to configure namespaces.</p> <h2>Namespaces</h2> <p>User namespaces tie it all together, allowing you to run as root inside a contained environment. When you buy a machine in the cloud, you expect to run stuff on it as root. Since they give enhanced privileges to the user, the user namespaces were unfortunately the source of a lot of exploits, although there haven't been any serious security breaches since 3.14, James said.</p> <p>User namespaces also map uids; in Linux, shadow-utils provides newuidmap and newgidmap for this. The user namespace hides unmapped uids, so they are inaccessible, even to "root" in the namespace. 
This creates an issue since a container image will mostly have the files with uid 0, which then should be mapped to the real kuid, and the fsuid across the userspace/kernel/storage boundary.</p> <p>In kernel 4.8, the superblock namespace was added to allow plugging a usb key or running a FUSE driver in a container. But for this to be useful, you need a superblock, which doesn't work with bind mounts, because you only have one superblock per underlying device.</p> <p>The mount namespace works by cloning the tree of mounts when you do <code>unshare --mount</code>; at first it's identical to the original one, but it diverges once you modify it. But all the modified mounts point to the same refcounted super_block structure. This might create issues: when you add a new mount inside a sub-namespace, it locks the other refcounted super_blocks from the host until you can umount the new mount. The usb key you plugged in your container, for example, completely locks the mount namespace trees.</p> <p>James then did a demo, showing with <code>unshare</code> that if you first create a user namespace, you can then create mount namespaces, despite being unable to do it before entering the user namespace. It shows how you can elevate your privileges with user namespaces, despite not being root, from an outside view.</p> <p>It was then shown how you can create a file that is really owned by root by manipulating the mount points inside the user/mount namespace by using marks with shiftfs.</p> <p>shiftfs isn't yet upstream, and other alternatives are being explored to solve the issues brought by the container world.</p> <h1>Refactoring the Linux kernel</h1> <p>by Thomas Gleixner</p> <p>The main motivation for Thomas' refactoring over the years was to get the RT patch in the kernel, and to get rid of the annoyances.</p> <h2>CPU Hotplug</h2> <p>One of his pet peeves is the CPU hotplug infrastructure. 
At first, the notifier design was simple enough for the needs, but it had its quirks, like the uninstrumented locking evading lockdep, or the obscure ordering requirements.</p> <p>While CPU hotplug was known to be fragile, people kept applying duct tape on top of it, which just broke down when the RT patch started adding hotplug support. After ten years, in 2012, Thomas attempted to rewrite it but ran out of spare time. He picked it up again in 2015 and it was finalized in 2017.</p> <p>It started by analysing all notifiers, and adding instrumentation and documentation in order to make the ordering requirements explicit. Then, one by one the notifiers were converted to states.</p> <p>The biggest rework was that of the locking. Adding lockdep coverage unearthed at least 25 deadlock bugs, and running Steven Rostedt's cpu-hotplug stress test tool could find one in less than 10 minutes. Answering a question from Ben Hutchings in the audience, Thomas said that these fixes are unfortunately very hard to backport, leaving old kernels with the races and locks.</p> <p>The lessons learned are that if you find a bug, you are expected to fix it. Don't rely on upstream to do that for you. There's a lot of bad code in the kernel, so don't assume you've seen the worst yet. You also shouldn't give up if you have to rewrite more things. Estimation in this context is very hard, and the original estimation of the task was off by a factor of three. In the end, the whole refactoring took 2 years, with about 500 patches in total.</p> <h2>Timer wheel</h2> <p>Its base concept was implemented in 1997, and extended over time. It was initially the base for all sorts of timers, and has mostly been used for timeouts since 2005.</p> <p>Those timeouts aren't triggered most of the time, but re-cascading them caused a lot of performance issues for timers that would get canceled immediately after re-cascading. 
This is a process that holds a spin-lock with interrupts disabled, and is therefore very costly.</p> <p>It took a 3-month effort to analyze the problem, then 2 months for a design and POC phase, followed by 1 month for the implementation, posting and review process. Some enhancements are still in-flight.</p> <p>The conversion was mostly smooth, except for a userspace visible regression that was detected 1 year after the code was merged upstream.</p> <p>The takeaway of this refactoring is to be prepared to do palaeontological research; don't expect anyone to know anything, or even care. And finally, be prepared for late surprises.</p> <h2>Useful tools</h2> <p>Git is the absolutely necessary tool for this work, with grep/log and blame. And if you need to dig through historical code, use the tglx/history merged repository.</p> <p>Coccinelle is also very useful, but it's a bit hard to learn and remember the syntax.</p> <p>Mail archives are very useful, but they need to be searchable. Also useful are quilt, ctags, and of course a good espresso machine.</p> <p>In the end, this isn't for the faint of heart, says Thomas. But it brings a lot of understanding on kernel history. It also gives you the skill to understand undocumented code. The hardest part is to fight the "it worked well until now" mentality. But, it is fun, for some definition of fun.</p> <h1>What's inside the input stack ?</h1> <p>by Benjamin Tissoires</p> <p>Why talk about input, isn't it working already, Benjamin asked. But the hardware makers are creative, and keep creating new devices with questionable designs.</p> <p>The usages keep evolving as well, with the ubiquitous move to touchscreen devices for example.</p> <h2>Components</h2> <p>The kernel knows about hardware protocols (HID), talks over USB, and sends evdev events to userspace.</p> <p>libinput was created on top of libevdev "because input is easy"; but it keeps being enhanced after three years, showing the simplicity of the task. 
It handles fancy things like gestures.</p> <p>The toolkits use libevdev, but they also handle gestures because of different touchscreen use cases.</p> <p>On top of that, the apps use toolkits.</p> <h2>The good, bad and ugly</h2> <p>Keyboards are mostly working, so it's good. Except for that Caps Lock LED in a TTY being broken since UTF-8 support isn't in the kernel.</p> <p>Mice are old too, so they are a solved problem. Except for those featureful gaming mice, for which the libratbag project was created to configure all the fancy features.</p> <p>Most touchpads are still using PS/2, but extending the protocol to add support for more fingers. On Windows, the touchpads communicate over i2c (in addition to PS/2). Sometimes the i2c enumeration goes through PS/2, but other times through UEFI.</p> <h2>Security</h2> <p>There were a few security issues, with an issue on Chromebook where they allowed the webapp to inject HID events through the uhid driver, and this enabled exploiting a buffer overflow in the kernel.</p> <p>In 2016, the <a href="http://www.mousejack.com/">MouseJack</a> vulnerability enabled remotely hacking wireless mice, which meant you could remotely send key events to a computer. You could also force a device to connect to your receiver. A receiver firmware update was pushed through GNOME Software for Logitech mice.</p> <h1>Linux Kernel Release Model</h1> <p>by Greg Kroah-Hartman <a href="https://github.com/gregkh/presentation-release-model">Slides</a></p> <p>While the kernel has 24.7M lines of code in more than 60k files, you only run a small percentage of that at a given time. There are a lot of contributors, and a lot of changes per hour. The rate of change is in fact accelerating.</p> <p>This is something downstream companies don't realize. They're getting behind faster than ever when not working with upstream.</p> <p>The release model is now that there's a new release every 2 or 3 months. All releases are stable. 
This time-based release model works really well.</p> <p>The "Cambridge Promise" is that the kernel will never break userspace. On purpose. This promise was formalised in 2007, and kept as best as possible.</p> <p>Version numbers mean nothing. Greg predicts that every 4 years, the first number will be incremented, so we might see Linux 5.0 in 2019.</p> <p>The stable kernels are branched after each release. They have publicly documented rules for what is merged; the most important one is that a patch has to be in Linus' tree.</p> <p>Longterm kernels are special stable versions, selected once a year, that are maintained for at least 2 years. This rule is now even applied by Google for every future Android device. This makes Greg think he might want to maintain some of those kernels for a longer time. Since people care, the longterm kernels also have a higher rate of bugfixes.</p> <p>Greg says you should always have a mechanism to update your kernel (and OS). What if you can't ? Blame your SoC provider. He took the example of a Pixel phone, where there's a 2.8M-line patch to mainline, for a total of 3.2M lines of running code. 88% of the running code isn't reviewed. It's very hard to maintain and update.</p> <p>Greg's stance is that all bugs can eventually be a "security" issue. Even a benign fix might become a security fix years later once someone realizes the security implications. Which is why you should always update to your latest stable kernel, and apply fixes as soon as possible.</p> <p>In conclusion, Greg says to take <strong>all</strong> stable kernel updates, and enable hardening features. If you don't use a stable/longterm kernel, your device is insecure.</p> <h1>Lightning talks</h1> <h2>Fixing Coverity Bugs in the Linux Kernel</h2> <p>by Gustavo A. R. Silva</p> <p>Coverity is a static source code analyzer. 
There are currently around 6000 issues reported by the tool for the Linux kernel; those are sorted in different categories.</p> <p>The first category is illegal memory access, followed by the medium category.</p> <p>Gustavo first worked on a missing break in a switch in the usbtest driver. Gustavo first sent a patch to fix the issue, then a second one to refactor the code following advice from the maintainer.</p> <p>Then he worked on arguments sent in the wrong order in scsi drivers. Following was an uninitialized scalar variable, and others. Gustavo showed many examples with obvious commenting or logic bugs.</p> <p>Tracking exactly which bugs were fixed was really useful to take note of similar issues. He sent in total more than 200 patches in three months, in twenty-six different subsystems.</p> <h2>Software Heritage: Our Software Commons, Forever</h2> <p>by Nicolas Dandrimont</p> <p>Open Source Software is important, Nicolas says. Its history is part of our heritage.</p> <p>Code disappears all the time, whether maliciously, or when a service like Google Code is shut down.</p> <p>Software Heritage is an open project to preserve all the open source code ever available. The main targets are VCS repositories, and source code releases. Everything is archived in the most VCS-agnostic data model possible.</p> <p>The project fetches the source code from many sources, and then deduplicates it using a Merkle tree. There are currently 3.7B source files from 65M projects. It's already the richest source code archive available, and growing daily.</p> <p>How do you store all of this on a limited budget (100k€ hardware budget)? It all fits in a single (big) machine. The metadata is stored in PostgreSQL, the files are in filesystems. XFS was selected, and they hit the bottlenecks pretty quickly.</p> <p>They are thinking of moving to a scale-out object storage system like Ceph. The project wants to lower the bar for anyone wanting to do the same thing. 
They also have plans to use more recent filesystem features.</p> <p>Software Heritage is currently looking for contributors and sponsors for this project.</p> <p><em>(That's it for day 1! Continued on <a href="kernel-recipes-2017-day-2.html">day 2</a> and <a href="kernel-recipes-2017-day-3.html">day 3</a>!)</em></p>Embedded Recipes 2017 notes2017-09-26T00:00:00+02:002017-09-26T00:00:00+02:00Anisse Astiertag:anisse.astier.eu,2017-09-26:embedded-recipes-2017-live-blog.html<p>Following <a href="kernel-recipes-2016-notes.html">last year's attempt</a>, I'm doing a live blog of Embedded Recipes 1st edition.</p> <p><img alt="jpg" src="/images/er2017/01-Anne.jpg" /></p> <h1>Understanding SCHED_DEADLINE</h1> <p>by Steven Rostedt</p> <p>Every task starts as SCHED_OTHER, where each task gets a fair share of the CPU bandwidth.</p> <p>Then comes SCHED_FIFO, where it's first in, first out: a task will run until it gives up the CPU. SCHED_RR shouldn't be used, Steve said, because it only works between tasks of the same priority.</p> <p>Steve gave an example of a machine that runs two tasks, one of a nuclear power plant, and one of a washing machine. The point is to show that priorities should be thought of in a system-wide view when using Rate Monotonic Scheduling. It's not as simple as which task is most important.</p> <h2>Earliest Deadline First (EDF)</h2> <p>Earliest deadline first solves some of the issues of RMS, by allowing tasks to run without missing their deadlines.</p> <p>Steve explained that sched_yield should never be used because it's almost always buggy. Except when using SCHED_DEADLINE of course, where it can be useful.</p> <p><img alt="jpg" src="/images/er2017/02-Steven.jpg" /></p> <h2>Multi processors</h2> <p>Steve then introduced a simple example to show Dhall's effect. It shows you can't get over a utilization of 1 when using EDF.</p> <p>If you want to partition EDF, it becomes similar to the packing problem, which is NP-complete. 
A solution is to use global EDF, which constrains the problem, but can solve a special case, and get more than 1 of utilization when using multiple processors.</p> <h2>The limits of SCHED_DEADLINE</h2> <p>It has to run on all CPUs.</p> <p>It cannot fork, because the tasks have been fixed.</p> <p>It's very hard to calculate the worst case execution time (WCET), and if you get it wrong, it breaks.</p> <p>Using cgroups, it's possible to configure SCHED_DEADLINE affinity. It still takes a long series of commands and settings in /proc, but this is being worked on, Steve says.</p> <p>It's possible to use Greedy Reclaim of Unused Bandwidth (GRUB) in order to utilize bandwidth left by some tasks, leaving more leeway to deal with WCET.</p> <h1>Proper APIs to HW video accelerators</h1> <p>by Olivier Crête</p> <p>There are various types of codecs: software, hardware, and then hardware accelerators. The last ones are the subject of Olivier's talk.</p> <p>Codecs can be used in a variety of contexts: players, encoders, streamers, transcoders, VoIP systems, content creation software, etc.</p> <p>The different use cases have different requirements: broadcast production wants high quality, and user generated content will have lower quality for example.</p> <p>Video calls care mostly about latency. When transcoding, you might care about latency if you're live, or about quality per bit if you want to store it.</p> <h2>Requirements</h2> <p>Exchange formats on the encoded side might need to support variance in packetization, byte stream, etc.</p> <p>The raw content might have different subsampling, color space, etc.</p> <p>Then, the memory layout might vary as well: is it packed (RGBRGBRGB) or planar (RRGGBBRRGGBB) ? Are there multiple planes (multiple DMAbuf fds) ? Do you have alignment requirements in memory ? You might have tiled formats, with different tiled formats, compressed in-memory formats, padding, etc.</p> <p>The memory allocation can be internal or external. 
In Linux, you mostly care about DMAbuf.</p> <p>There might be attached metadata, per-frame like timestamps, or Vertical Ancillary Data (VANC): AFD. There can also be inter-frame data like SCTE-35/104, such as ad insertion points.</p> <p>A good API, Olivier says, should support push or pull modes for different uses. A good API should be a living, maintained project, with Open Source code as opposed to being just a specification.</p> <p><img alt="jpg" src="/images/er2017/03-Olivier.jpg" /></p> <h2>Existing (wrong) solutions</h2> <p>OpenMAX IL is everywhere because it's required by Android. But no one implements the full OpenMAX, only the Android subset, validated through CTS. The spec isn't maintained at Khronos anymore, and the last library passing the full test suite was from 2011. It's a fragmented landscape.</p> <p>OpenMAX has a specific threading and allocation model. The whole framework isn't a good API according to Olivier.</p> <p>libv4l is a "transparent" wrapper over the kernel API, but it's tied to the kernel API rules, and has limited maintenance, Olivier says.</p> <p>VA-API is more interesting, albeit Intel-specific. Still, it requires complex code, and is video-only.</p> <p>GStreamer is a whole multimedia framework, with a specific way of working (threads, allocation, etc.), not a HW acceleration API. It's not designed for low latency.</p> <p>FFmpeg/libav is kind of OK, Olivier says, but is not focused on the hardware side. 
MFT on Windows is close to what Olivier is seeking, but tied to Windows.</p> <h2>Simple Plugin API (SPA)</h2> <p>This library, coming from PipeWire, matches all of Olivier's requirements: no pipeline, no framework, registered buffers, synchronous or asynchronous modes, externally-provided thread contexts, and not limited to codecs.</p> <p>It's available on GitHub:</p> <p>https://github.com/PipeWire/pipewire/tree/master/spa</p> <p>It can work outside PipeWire, although it hasn't been picked up elsewhere yet.</p> <h1>Introduction to the Yocto Project - OpenEmbedded-core</h1> <p>by Mylène Josserand</p> <h2>Why use a build system ?</h2> <p>There are many constraints in embedded systems to match. You can try building everything manually, but despite the flexibility, it's a dependency hell, and lacks reproducibility. Binary distributions are less flexible, harder to customize, and not available on all architectures.</p> <p>A build system like Buildroot or Yocto is a middle ground between the two.</p> <p>In Yocto, you have multiple tasks to download, configure, compile and install the builds. The tasks are grouped in recipes, and you manage recipes with Bitbake.</p> <p>Many common tasks are already defined in the OpenEmbedded core. Many recipes are available, organised in layers.</p> <p>OpenEmbedded is co-maintained by the Yocto Project and OE project. It's the base layer, the core of all the magic as Mylène says.</p> <h2>Workflow</h2> <p>Poky is a distribution built on top of the OpenEmbedded Core, and provides Bitbake.</p> <p>The general workflow is Download -&gt; Configure -&gt; Compile. You download the proper version you want with git clone.</p> <p>To add applications, add layers (compilations of recipes). These are folders in your poky directory. Always look at existing layers before creating a recipe. 
Do not edit upstream layers if you don't want breakage when updating.</p> <p>To configure the build, you first source the Bitbake environment, which moves you to a build folder, and gives you a set of commands in order to do the build. You can then edit the local.conf to set your MACHINE, which describes the hardware and can be found in specific BSP layers, and set up your DISTRO, which represents top-level configuration that will be applied on every build, and brings toolchains, libc, etc. Then the IMAGE brings the apps, libs, etc.</p> <p><img alt="jpg" src="/images/er2017/04-Mylène.jpg" /></p> <h3>Creating a layer</h3> <p>When needed, you might create a layer, whether you have custom hardware, or want to integrate your own application. You can do that with the yocto-layer tool, which does the heavy lifting. It's a good practice to create multiple layers to share common tasks/recipes between projects.</p> <p>Recipes are created in .bb files, the format that bitbake understands. The naming of the file is application-name_version.bb, and a file is split in header, source, and tasks parts.</p> <p>Mylène says it's a good practice to always use remote repositories to host app sources to make development quicker. App sources should never be in the layer directly. The folder organization should always be the same in order to find the recipes faster.</p> <p>Sometimes, you might want to extend an existing recipe, without modifying it. It's possible with the Bitbake engine, by creating .bbappend files. All .bbappend files are version specific. 
They can be used to add patches, or customize the install process by appending a task.</p> <h3>Creating an image</h3> <p>An image is a top-level recipe. It has the same format as other recipes, with specific variables on top, like IMAGE_INSTALL to list the included packages, or IMAGE_FSTYPES for the binary format of images you want (ext4, tar, etc.).</p> <p>It's a good practice to only install what you need for your system to work.</p> <h3>Creating a machine</h3> <p>The machine describes the hardware. It contains variables related to the architecture, like TARGET_ARCH for the architecture, or KERNEL_IMAGETYPE.</p> <h1>Mainline Linux on AmLogic SoCs</h1> <p>by Neil Armstrong</p> <p>The Amlogic SoC family has multimedia capabilities, and is used in many products. They have different products, ranging from Cortex-A9 to Cortex-A53 CPUs.</p> <p>The SoCs are very cheap at ~7$ when compared to competitors.</p> <p>Amlogic SoCs are used in many different cheap Android boxes. They are also in community boards from ODroid, the Khadas VIM1/2, NanoPi K2 or Le Potato, which has been designed by BayLibre.</p> <p>The <a href="https://libre.computer">Libre Computer board</a> has been backed on Kickstarter. Mainline support is done by BayLibre, with many peripherals already working.</p> <p>The upstream support was started in 4.1 by independent hackers. From 4.7, BayLibre started working on it. The bulk of the work went in 4.10 and 4.12.</p> <p>The work was concentrated on 64-bit SoCs (the latest ones), but the devices are very similar inside the family.</p> <h2>Drivers so far</h2> <p>Dynamic Voltage and Frequency Scaling is a complex part of the work, since it's done on a specific CPU in the SoC, and ARM changed the protocol after some time and did not publish the old one at first.</p> <p>SCPI (the DVFS driver on this SoC) is now supported on 4.10 though.</p> <p>Kevin Hilman wrote a new eMMC host driver from the original implementation and public datasheet. 
It performs very well.</p> <p>At the end of 2016, Amlogic released a new variant of those SoCs, the S905X, and supporting it was easily done by re-architecting the Device Tree files.</p> <p>For CVBS (analog video support), support was integrated in 4.10. For HDMI, Amlogic integrated a Synopsys DesignWare HDMI Controller, and a clean dw-hdmi bridge has been published, sharing the code between different SoC families. The PHY was custom though, as well as the HPD.</p> <p>CEC support was merged using the CEC framework maintained by Hans Verkuil.</p> <p>The Mali GPU inside the SoC does not have a fully open driver. The kernel driver is open source and available, but the userspace shared library is delivered as a binary blob that has to be compiled by the SoC vendor to customize it.</p> <p><img alt="jpg" src="/images/er2017/05-Neil.jpg" /></p> <h2>Work in progress</h2> <p>There is still a lot of work for the Video Display: cursor plane, overlay planes, OSD scaling and overlay scaling are missing, for example.</p> <p>The DRM driver only has a single, primary plane, without scaling. Support for scaling, planes with different sub-sampling (various YUV formats) and overlay planes is still missing.</p> <p>In Audio land, S/PDIF input and output is missing. I2S is working for output only, through HDMI or an external DAC, but the embedded stereo DAC in GXL/GXM and I2S input aren't supported.</p> <p>Video Hardware Acceleration, while one of the best features of the SoC, is still missing, Neil says. There's at least 6 months of development to have a proper V4L2 driver.</p> <h2>Community</h2> <p>There are a lot of hobbyists hacking on the Odroid-C2 board, and running LibreELEC and Kodi. Many Raspberry Pi-oriented projects are also ported to Amlogic boards.</p> <p>There are upstream contributions from independent hackers. 
This is also helped by the growing diversity of Single Board Computers (SBCs) with these SoCs.</p> <h1>Long-Term Maintenance, or How to (Mis-)Manage Embedded Systems for 10+ Years</h1> <p>by Marc Kleine-Budde</p> <p>Marc started by asking the audience who had Embedded Systems in the field, for how long, and which ones were still maintained. Then he asked who had to update to fix a vulnerability, and how long it took to deploy.</p> <p>The context of the talk is systems created by small teams, using custom hardware, and pushing out new products every few years, that need to be supported for more than 10 years.</p> <p>The traditional Embedded Systems Lifecycle starts with a Component Version Decision, followed by HW/SW development, then the maintenance starts. It's usually the longest phase.</p> <p>Marc showed graphs of vulnerabilities per year in the kernel, glibc and OpenSSL. Despite most vulnerabilities being Denial of Service, it's still a lot. There's also the infamous "rootmydevice" in a proc file that was in a published linux-sunxi kernel from Allwinner.</p> <p>Don't trust your vendor kernel, Marc says.</p> <h2>Field Observations</h2> <ul> <li>vendor kernels are already obsolete at start of project</li> <li>the workflow for customized pre-built distributions isn't standard</li> <li>you get the worst of both worlds if you select "longterm" components but don't have an update concept</li> <li>if your update process isn't proven, it's bad</li> <li>there's a critical vulnerability in a relevant component at least twice a year</li> <li>upstream only maintains components for 2 to 5 years</li> <li>Server distros are made for admin interaction, and not suited to embedded systems.</li> </ul> <p>It all leads to the conclusion that Continuous Maintenance is very important.</p> <p>Backporting, while simple at its core (you take a patch and apply it), doesn't scale. 
As you get more products, versions diverge; as you make local modifications, test coverage is reduced; and after a few years, it's almost impossible to decide which upstream fixes are relevant.</p> <p>If you don't want your product to become part of a botnet, you need to have a few safeguards. You need to have a short time between incident and fix, have a low risk of negative side effects, predict maintenance cost, and have this whole process scalable to multiple products.</p> <p>These are ingredients for a sustainable process: making sure you can upgrade in the field, reviewing security announcements regularly, always using releases maintained by upstream, disabling unused components and enabling security hardening.</p> <p><img alt="jpg" src="/images/er2017/06-Marc.jpg" /></p> <h2>Development Workflow</h2> <p>It's important to submit changes to upstream to reduce maintenance effort.</p> <p>You need to automate the processes as early as possible: use CI.</p> <p>When starting a new project, use the development version of upstream projects, so that when you reach completion, it's in a stable state, and still maintained, as opposed to already obsolete.</p> <p>Every month, do periodic maintenance: integrate maintenance releases in order to be prepared, review security announcements, and evaluate impact on the product.</p> <p>When you identify a problem, apply the upstream fix, and leverage your automated build, testing and deployment infrastructure to publish.</p> <p>Marc advises using Jenkins 2 with Pipeline as Code. For test automation, there's kernelci.org or LAVA. For redundant boot, barebox has bootchooser, and u-boot/grub can do it with custom scripts, as can UEFI. For the update system, there is RAUC, OSTree or SWUpdate. Finally, there are now many different rollout schedulers like hawkBit, mender.io, resin.io, but you can also use a static server or custom application.</p> <h2>Conclusion</h2> <p>Marc says that simply ignoring the problem does not work. 
Don't try ad-hoc fixes, they don't scale. Customized server distributions aren't suited to the embedded use case.</p> <p>What works is upstreaming, process automation and having a proper workflow.</p> <h1>Developing an embedded video application on dual Linux + FPGA architecture</h1> <p>by Christian Charreyre</p> <p>The application discussed in this talk has high real time and performance constraints. It must be able to merge and synchronize images issued by 2 cameras, with safety constraints. Target latency is less than 200ms, with boot time less than 5s.</p> <p>Christian says that in a previous video application, they worked on an ARM SoC with gstreamer, but it didn't match the safety requirements, so they decided to go with a hybrid FPGA+Linux solution.</p> <p>Target hardware is a PicoZed, a System On Module based on a Xilinx Zynq, which embeds an ARM processor as well as an FPGA in the SoC. Its software environment is Yocto-based, and does not use the Xilinx-provided solutions Petalinux or Wind River Pulsar Linux, because of their particular quirks. Yocto is now well known, and Christian decided to pick up the meta-xilinx layer and start from that instead. All necessary layers are from the OE layer Index.</p> <p>FPGA development is done with the Eclipse-based Xilinx Vivado tool, which enables scripting with tcl.</p> <p>The AXI bus is used to communicate between the Linux host and the FPGA design. It allows adding devices accessible from Linux, extending the capabilities: for example, a new serial line, dedicated hardware. It also allows dynamically changing the video pipeline by changing the parameters.</p> <p><img alt="jpg" src="/images/er2017/07-Christian.jpg" /></p> <h2>Boot mechanism</h2> <p>The PicoZed needs a First Stage Boot Loader (FSBL), before u-boot. This FSBL is generated by the Vivado IDE according to the design. 
The FSBL then starts u-boot, which starts Linux.</p> <p>The FPGA can't start alone, and its code (bitstream) is loaded by the FSBL or u-boot. The Xilinx Linux kernel has drivers for devices programmed in the FPGA. It uses device tree files to describe the specific configuration available at the moment. Vivado generates the whole device tree, not just the part for the Programmable Logic (FPGA); it merges the two in a single system.dts file.</p> <p>It's a good idea to automate the process of rebuilding the device tree after each change in Vivado, Christian says.</p> <p>The boot is made of several stages before showing an image, making boot time optimization a complex problem: FSBL, u-boot, bitstream loading, kernel start, etc. Various techniques were used to reduce boot time. Inside u-boot, the bootstage report was activated, and the initialization of some devices was disabled.</p> <p>Bootchart was used to profile Linux startup: the kernel size was reduced, the system console removed, and the init scripts reordered. Filesystem checks were bypassed by using a read-only filesystem. SPI bus speed was increased. Other techniques were used, and the 5 second goal was met.</p> <h2>Closing words</h2> <p>While the design of the system was done so that only the part on the FPGA is impacted by the certification process, the bitstream code is still updated through Linux on the network. Therefore code signing was used in the installer and updater mechanisms to protect the integrity of the system.</p> <p>According to Christian, the project had many unknowns before starting, but those were surmounted. The split design constraint paid off. The choice of the meta-xilinx layer is a good one, because of its good quality. 
You only need to understand that the device tree is not built within the kernel; once you understand the general structure, it works well, and the distribution is well tailored to the requirements.</p> <h1>Lightning talks</h1> <h2>Atom Linux</h2> <p>by Christophe Blaess</p> <p><a href="https://github.com/AtomLinux">Atom Linux</a> is a new embedded Linux distro designed by Christophe. It's a binary distribution, but definitely embedded-oriented. It aims to be industrial-quality.</p> <p>Atom Linux targets small companies that already have an embedded Linux project, but with poor embedded Linux knowledge. It aims to provide a secure update system (with rollback, factory defaults, etc.). It wants to be power-failure proof with a read-only rootfs, and data backup.</p> <p>It's easy to configure, Christophe says. The base system is already compiled. It provides a UI for configuration. It aims to make custom code integration simple by providing a toolchain in a VM or natively if needed.</p> <p><img alt="jpg" src="/images/er2017/08-Christophe.jpg" /> The user starts by downloading the base image for their target, then installing the configuration tool. The user configures the base image with a few parameters. The configuration tool merges the prebuilt packages and the user's custom code in a new root filesystem image.</p> <p>This image is then stored in the user's repository (update server), and at first boot, the system does an update.</p> <p>Currently, the base image builder works, as well as u-boot and the update shell scripts. The first version of the configuration tool is Qt-based, but it's very ugly according to Christophe. He still wants to improve the tool, and rewrite the base image builder as a Yocto layer. Christophe is looking for contributors and asks anyone interested to contact him.</p> <h2>Wayland is coming</h2> <p>by Fabien Lahoudere</p> <p>Fabien started by saying that he is just a user, not a Wayland developer. 
Wayland is a protocol for a compositor to talk to its clients. It aims to be a simpler replacement for X.</p> <p>Wayland is designed for modern devices, more performant, simpler to use and configure according to Fabien. It's also more secure, supported by toolkits, and the future of Linux distributions. For instance, it prevents keyloggers, which are very easy to implement with X11.</p> <p><img alt="jpg" src="/images/er2017/09-Fabien.jpg" /> Wayland is more performant, because it has less ping/pong between the compositor and the clients. Weston is the reference implementation. It's a minimal and fast Wayland compositor. You can extend it by using libweston. There's also AsteroidOS and Maynard, which are two embedded-oriented Wayland compositors.</p> <p>It's also possible to use a "legacy" X application through Xwayland. In fact, Fabien did his whole presentation on a small iMX6Solo based board running evince on top of Wayland.</p> <p>Someone from the audience said they recently had to work with QtCompositor, and it was very simple to use.</p> <h2>Process monitoring with systemd</h2> <p>by Jérémy Rosen</p> <p>Jérémy says systemd is a very good tool for embedded systems. It costs about 6MB of disk space when built with Yocto. It's already integrated in Yocto and Buildroot.</p> <p>systemd makes it easy to secure processes with capabilities and system call restrictions; it can bind mount files to control exactly what a process sees. It makes it easy to control resources with cgroups, as well as to monitor processes.</p> <p>Jérémy compared moving from init scripts to systemd to going from svn to git. It requires understanding and re-learning a lot of things, but is really worth it in the end.</p> <p><img alt="jpg" src="/images/er2017/10-Jérémy.jpg" /> systemd provides very fine-grained control on how to kill a unit: which command to send, which signal to send when it doesn't work, what cleanup command to run, etc. You can define what is a normal or abnormal stop. 
It can restart an app automatically, and rate-limit this. You can also do coredump management and soft watchdog monitoring; systemd also monitors itself with a hardware watchdog.</p> <p>Fine-grained control of how services interact is also available. You can react to hardware changes, filesystem changes, use socket activation, etc.</p> <p>Jérémy said monitoring is a solved problem for him in embedded and he does not want to work on custom solutions anymore.</p> <p><em>That's it for Embedded Recipes first edition ! Congratulations on reading this far !</em></p> <p><img alt="jpg" src="/images/er2017/11-Speakers.jpg" /></p>awk driven IoT2017-07-05T00:00:00+02:002017-07-05T00:00:00+02:00Anisse Astiertag:anisse.astier.eu,2017-07-05:awk-driven-iot.html<p>With a Raspberry Pi or other modern single-board computers, you can make very simple toys. I started with the hello world of interactive apps: the <a href="https://en.wikipedia.org/wiki/Soundboard_(computer_program)">soundboard</a>. But even then, I was too ambitious.</p> <h1>The soundpad</h1> <p>Since the main target was kids, I wanted a simple, screen-less toy that would teach the basics of interactivity, as well as serve as a platform for learning. A simple soundboard can be quite useful to learn animal calls for instance, so I was set.</p> <p>But I also wanted this toy to be wireless and interact with the house. For the first part, I decided to hook up an old <a href="https://en.wikipedia.org/wiki/IControlPad">bluetooth game controller</a> I had lying around. I was able to detect its keys with <code>evtest</code> pretty quickly and make an inventory of all button keycodes:</p> <div class="highlight"><pre><span></span>304 305 306 307 308 312 313 314 315 316 317 318 </pre></div> <p>For the sounds, I reused the sounds present in the default raspbian scratch installation. There are a few wave files in <code>/usr/share/scratch/Media/Sounds/</code> that proved useful. 
I made a few directories with symbolic links to the samples I was interested in. Combining <a href="https://joeyh.name/code/moreutils/"><code>vidir</code></a> and the previous keycode list, I ensured each wave file name started with a keycode, like this for the Animal sounds:</p> <div class="highlight"><pre><span></span>304-Bird.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Bird.wav
305-Cricket.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Cricket.wav
306-Crickets.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Crickets.wav
307-Dog1.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Dog1.wav
308-Dog2.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Dog2.wav
312-Duck.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Duck.wav
313-Goose.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Goose.wav
314-Horse.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Horse.wav
315-HorseGallop.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/HorseGallop.wav
316-Kitten.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Kitten.wav
317-Meow.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Meow.wav
318-Owl.wav -&gt; /usr/share/scratch/Media/Sounds/Animal/Owl.wav
</pre></div> <p>In order to interact with the house, I paired a bluetooth soundbar to the Raspberry Pi.</p> <p>Once all of this is set up, this is the entirety of the code for the first iteration of the working soundpad (soundboard + joypad):</p> <div class="highlight"><pre><span></span><span class="ch">#!/bin/bash</span>
<span class="nb">cd</span> <span class="nv">$1</span>
stdbuf -o0 evtest /dev/input/event0<span class="p">|</span> awk -W interactive <span class="s1">&#39;</span>
<span class="s1">/EV_KEY/ { if ( $NF == 1) { system(&quot;paplay &quot; $8 &quot;-*.wav&amp;&quot;) }}&#39;</span>
</pre></div> <ul> <li><code>stdbuf</code> is very useful when playing with pipes where the input command is blocking, but you still want interactivity. 
It allows you to control I/O buffering.</li> <li><code>evtest</code> parses input events.</li> <li><code>awk -W interactive</code> has the same role as <code>stdbuf -i0</code>, but for <code>mawk</code>'s internal buffering (it's not needed for GNU awk).</li> <li>when a matching line is found, <code>paplay</code> is used to play the audio through pulseaudio's bluez sink, which was previously configured as default. The filename corresponds to the button keycode.</li> </ul> <p>The last iteration has the same core code, but with a bit more setup: mainly using <code>bluetoothctl</code> and <code>pactl</code> to make sure the controller and the soundbar are properly connected and configured.</p> <p>It worked, for the most part, but was far from plug-and-play. The soundbar needed to be turned on and put in bluetooth mode. The wireless joypad had to be turned on. It needed constant re-setup of the bluetooth connections, because it lost the pairings regularly. And sometimes the audio would stutter horribly. I tried compiling a more recent version of bluez, to no avail.</p> <p>So after a few days of demos and sample playing, I binned this project about 9 months ago.</p> <h1>The music portal</h1> <p>Fast forward to today, I had this thing bothering me about modern music and rhymes for kids. With Deezer &amp; Spotify, we have access to a library we could only dream of. But it's impossible for a 2-year-old child to operate, and it isn't even desirable.</p> <p>Even without online services, the only alternative would be to go back to audio CDs. But the only functional CD player in our house is the CD-ROM drive in my desktop computer; I therefore back up all our audio CDs to audio files. Playing those has the same level of complexity (and screen-interaction) as interacting with streaming services, so it's back to square one.</p> <p>That's where the music portal comes in. 
It's a combination of a <a href="https://en.wikipedia.org/wiki/Mir:ror">Violet Mir:ror</a> I had lying around, and a Raspberry Pi with a speaker hooked up.</p> <p>The Mir:ror is a very simple RFID reader. It's basically plug-and-play on Linux, since it sends raw HID events, with the full ID of the tags it reads, and it has audio and visual feedback. I also evaluated using a <a href="https://en.wikipedia.org/wiki/Skylanders">Skylanders</a> portal, which also sends raw HID events, but its data is much less detailed, with only two bytes of information in the HID events; it needs more work to <a href="https://github.com/silicontrip/SkyReader">get the full data</a>, and it has no audio or visual feedback.</p> <p>So here is the code of the first version:</p> <div class="highlight"><pre><span></span><span class="ch">#!/bin/bash</span>
sudo stdbuf -o0 hexdump -C /dev/hidraw0 <span class="p">|</span> awk -W interactive <span class="s1">&#39;</span>
<span class="s1"># hexdump -C separates the two 8-byte halves of a line with a double space</span>
<span class="s1">/02 01 00 00 08 d0 02 1a  03 52 c1 1a 01 00 00 00/ { print &quot;file1 &quot;; play=1; file=&quot;file1.mp3&quot; ; }</span>
<span class="s1">/02 01 00 00 08 d0 02 1a  03 52 c1 4b ad 00 00 00/ { print &quot;file2 &quot;; play=1; file=&quot;file2.mp3&quot; ; }</span>
<span class="s1">/02 01 00 00 04 3f d7 5f  35 00 00 00 00 00 00 00/ { print &quot;dir1 &quot;; play=1; file=&quot;dir1/*.mp3&quot; ; }</span>
<span class="s1">/ 02 02 00 00 0. |01 05 00 00 00 00 00 00  00 00 00 00 00 00 00 00/ { print &quot;stop&quot;; system(&quot;killall -q mpg321&quot;); }</span>
<span class="s1">{</span>
<span class="s1">if (play) {</span>
<span class="s1"> system(&quot;mpg321 -q &quot; file &quot; &amp; &quot;);</span>
<span class="s1"> }</span>
<span class="s1">play=0 ;</span>
<span class="s1">}</span>
<span class="s1">&#39;</span>
</pre></div> <ul> <li>we use the same <code>stdbuf</code> and <code>awk -W interactive</code> trick as before. 
Fun fact: I rediscovered this <code>mawk</code> argument by reading the man page while doing this project because I had forgotten about it in only 9 months. I don't think I'll forget it again.</li> <li>Here we're matching full HID event lines. We don't even bother decoding the payload size, etc., since it all fits on a line matchable by <code>awk</code>.</li> <li>I used <code>mpg321</code> because it has the smallest footprint when compared to <code>mpg123</code>, <code>gst-launch</code>, <code>mplayer</code>, <code>vlc</code>, and others.</li> <li>I used the same symbolic link structure because it's much easier than putting the full file names in the script.</li> <li>We handle "tag" removal as well as portal shutdown. The Mir:ror automatically shuts down when turned face down.</li> <li>There are race conditions hiding here. It's not a big deal, it's just a prototype.</li> </ul> <p>What could I use after I set up the two included Nanoztags ? I could put RFID stickers on objects; or I could use my visa card; or anything that has an RFID/NFC feature (like my phone). But there are better, off-the-shelf choices available: <a href="https://en.wikipedia.org/wiki/Toys-to-life">toys-to-life</a> like Skylanders ! 
They are already made for kids, are very sturdy, and I managed to snag a few on clearance at ~1€ a piece !</p> <p>Make sure the Raspberry Pi is connected to your wireless network, so you can add new songs remotely, and throw in a <a href="https://www.freedesktop.org/software/systemd/man/systemd.service.html">systemd.service</a> for automatic starting, and the toy is finished:</p> <div class="highlight"><pre><span></span><span class="k">[Unit]</span>
<span class="na">Description</span><span class="o">=</span><span class="s">Music Portal</span>

<span class="k">[Service]</span>
<span class="na">Type</span><span class="o">=</span><span class="s">simple</span>
<span class="na">ExecStart</span><span class="o">=</span><span class="s">/home/pi/musicportal.sh</span>
<span class="na">User</span><span class="o">=</span><span class="s">pi</span>
<span class="na">Group</span><span class="o">=</span><span class="s">pi</span>
<span class="na">WorkingDirectory</span><span class="o">=</span><span class="s">/home/pi</span>
<span class="na">StandardOutput</span><span class="o">=</span><span class="s">journal+console</span>
<span class="na">StandardError</span><span class="o">=</span><span class="s">journal+console</span>
<span class="na">Restart</span><span class="o">=</span><span class="s">always</span>
<span class="na">RestartSec</span><span class="o">=</span><span class="s">3</span>

<span class="k">[Install]</span>
<span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</pre></div> <p>And it's truly plug-and-play: you just need to plug in the Raspberry Pi, and it powers the speaker through USB, as well as the Mir:ror.</p> <p>Here's a video of the final result:</p> <iframe width="560" height="315" src="https://www.youtube.com/embed/I1Vc38DaTQY" frameborder="0" allowfullscreen></iframe> <p>Last but not least, the title of this article is <em>awk driven <strong>I</strong>oT</em>. 
So I integrated <a href="https://github.com/plietar/librespot">librespot</a>, and I can now play songs and rhymes from this online streaming service ! Success ✔</p>Go Time2017-02-19T00:00:00+01:002017-02-19T00:00:00+01:00Anisse Astiertag:anisse.astier.eu,2017-02-19:go-time.html<p>For the Go 1.8 Release Party in Paris I gave a lightning talk on monotonic clocks. It's essentially a talk version of <a href="https://golang.org/design/12914-monotonic">Russ Cox's design document on monotonic clocks</a>, which is really well written and sourced. You should go read it !</p> <p>Here are the <a href="https://anisse.github.io/gotime">slides export</a> and their <a href="http://github.com/anisse/gotime">source</a>.</p>Embedded Linux Conference Europe 20162016-10-21T00:00:00+02:002016-10-21T00:00:00+02:00Anisse Astiertag:anisse.astier.eu,2016-10-21:embedded-linux-conference-europe-2016.html<p>I was in Berlin last week for ELCE, and it was great. It was a nice mix of talks on many different subjects, and as always you come back with lots of new ideas and improved motivation.</p> <p>As you know, I took some notes for <a href="kernel-recipes-2016-notes.html">Kernel Recipes 2016</a>, and lots of people at ELCE told me they were glad to have read them. I have since updated the article with videos and LWN links for future readers.</p> <p>Since I was recovering from attending dotGo on Monday, and then flying to Berlin, I did not take notes at ELCE this year. But during the conference I ran into Arnout Vandecappelle, a Buildroot contributor and fellow embedded engineer. 
He took great notes which he posted on <a href="https://mindlinux.wordpress.com/">his company's blog</a>:</p> <ul> <li><a href="https://mindlinux.wordpress.com/2016/10/11/irqs-the-hard-the-soft-the-threaded-and-the-preemptible-alison-chaiken/">IRQs: the Hard, the Soft, the Threaded and the Preemptible – Alison Chaiken</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/11/survey-of-open-hardware-2016-john-hawley-intel/">Survey of Open Hardware 2016 – John Hawley, Intel</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/12/building-a-bards-farm-continuous-integration-and-remote-control-antoine-tenart-quentin-schulz-free-electrons/">Building a Boards Farm: Continuous Integration and Remote Control – Antoine Tenart &amp; Quentin Schulz, Free Electrons</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/12/demystifying-systemd-for-embedded-systems-gustavo-sverzut-barbieri-profusion-embedded-systems/">Demystifying Systemd for Embedded Systems – Gustavo Sverzut Barbieri, ProFUSION Embedded Systems</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/12/running-ubiubifs-on-mlc-nand-richard-weinberger-sigma-star-boris-brezillon-free-electrons/">Running UBI/UBIFS on MLC NAND – Richard Weinberger, sigma star &amp; Boris Brezillon, Free Electrons</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/12/reconfigurable-computing-architecture-for-the-linux-kernel-vince-bridgers-yves-vandervennet-intel-ex-altera/">Reconfigurable Computing Architecture for the Linux Kernel – Vince Bridgers &amp; Yves Vandervennet, Intel (ex-Altera)</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/13/using-sched_deadline-steven-rostedt-red-hat/">Using SCHED_DEADLINE – Steven Rostedt, Red Hat</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/13/the-internet-of-things-and-life-beyond-linux-wolfgang-mauerer-technical-university-regensburgsiemens-ag/">The Internet of Things and Life Beyond Linux – Wolfgang Mauerer, Technical 
University Regensburg/Siemens AG</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/13/modernizing-the-nand-framework-the-big-picture-boris-brezillon-free-electrons/">Modernizing the NAND Framework: The Big Picture – Boris Brezillon, Free Electrons</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/13/update-on-shared-logging-between-the-kernel-and-the-bootloader-sean-hudson-mentor-graphics-inc/">Update on Shared Logging between the Kernel and the Bootloader – Sean Hudson, Mentor Graphics, Inc</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/13/gpio-for-engineers-and-makers-linus-walleij/">GPIO for Engineers and Makers – Linus Walleij</a></li> <li><a href="https://mindlinux.wordpress.com/2016/10/13/open-source-tools-for-fpga-development-marek-vasut-denx-software-engineering/">Open-Source Tools for FPGA Development – Marek Vašut, DENX Software Engineering</a></li> </ul> <p>I attended a few of those talks, and I can tell you he did a great recap.</p> <p>See you next year !</p>Kernel Recipes 2016 notes2016-09-28T00:00:00+02:002016-09-28T00:00:00+02:00Anisse Astiertag:anisse.astier.eu,2016-09-28:kernel-recipes-2016-notes.html<p><strong>Update 2016-10-21:</strong> I've added links to the <a href="https://www.youtube.com/playlist?list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm">videos</a> and <a href="https://lwn.net/Archives/ConferenceByYear/#2016-Kernel_Recipes">LWN articles</a>, which are of much higher quality than these live notes.</p> <p>This year I'm trying a live blog of <a href="http://kernel-recipes.org/">Kernel Recipes 2016</a>, live from Paris, at Mozilla's headquarters. 
You can watch the <a href="https://air.mozilla.org/kernel-recipes-2016-09-28-AM-Session/">live stream here</a>.</p> <h1>The kernel report</h1> <p>by Jonathan Corbet; <a href="https://www.youtube.com/watch?v=9DM3DaQkbGw&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=2">video</a></p> <p>We've had 5 kernel releases since last November, with 4.8 hopefully coming out on Oct 2nd. There were between 12k and 13k changesets for each release. About 1.5k devs contributed to each release.</p> <p>The number of developers contributing to each release is stable, growing slowly. For each new release, there are about 200 first-time contributors.</p> <p>The development process continues to run smoothly, and not much is changing.</p> <h2>Security</h2> <p>Security is a hot topic right now. Jon showed an impressive list of CVE numbers, estimating that the actual number of flaws is about double that.</p> <p>The current process for fixing security flaws is like a game of whack-a-mole: there are more and more new flaws, and in the end it's not clear you can keep up.</p> <p>The distributors also aren't playing their part in pushing updates to users.</p> <p>So vulnerabilities will always be with us, but what is possible is eliminating whole classes of exploits. Examples of this include:</p> <ul> <li>Post-init read-only memory in 4.6</li> <li>Use of GCC plugins in 4.8</li> <li>Kernel stack hardening in 4.9</li> <li>Hardened usercopy in 4.8</li> <li>Reference-count hardening is being worked on.</li> </ul> <p>A lot of this originates in grsecurity.net, and some of it is being funded by the Core Infrastructure Initiative.</p> <p>The catch is that there are performance impacts, so it's a tradeoff. So can we convince kernel developers it's worth the cost?
Jonathan is optimistic that the mindsets are changing towards a yes.</p> <h2>Kernel bypass</h2> <p>A new trend is to bypass the kernel, for instance its network stack for people doing high-frequency trading.</p> <p>Transport over UDP (TOU) is an example of this, enabling applications to implement transport protocols in userspace. The QUIC protocol in Chrome is one such protocol.</p> <p>The goal here is to be able to make faster changes in the protocol. For instance, TCP Fast Open has been available for a long time in the kernel, but most devices out there (Android, etc.) have such an old kernel that nobody is using it.</p> <p>Another goal is to avoid middlebox interference (for example, they mess with TCP congestion, etc.). So here, the payload is "encrypted" and not understood by those middleboxes, so they can't interfere with it.</p> <p>The issue with TOU is that we risk having every app (Facebook, Google, etc.) speaking its own protocol, killing interoperability. So the question is: will the kernel still be a strong unifying force for the net?</p> <h2>BPF</h2> <p>The Berkeley Packet Filter is a simple in-kernel virtual machine. Users can load code in the kernel with the bpf() syscall.</p> <p>It's safe because there are a lot of rules and limitations to make sure BPF programs do not pose a problem: for example, they can't loop, access arbitrary memory, access uninitialized memory, or leak kernel pointers to user space.</p> <p>The original use case of BPF was of course to filter packets. Nowadays it allows system call restriction with seccomp(), perf events filtering, or tracepoint data filtering and analysis. This is finally the Linux "dtrace".</p> <h2>Process</h2> <p>A lot has changed since 2.4. At the time, distributors backported lots of code and out-of-tree features.</p> <p>Since then, the "upstream first" rule and the new regular release cycle (every 10 weeks or so) have helped solve a lot of problems.</p> <p>Yet, there are still issues.
For instance, a phone running the absolute latest release of Android (7.0) is still running kernel 3.10, which was released in June 2013 and is 221k+ patches behind mainline.</p> <p>So why is this? Jonathan says that fear of the mainline kernel is a reason. With the rate of change there's the possibility of new bugs and regressions.</p> <p>Jon then showed a table compiled by Tim Bird showing that most phones have a vast amount of out-of-tree code to forward port: between 1.1M and 3.1M lines of inserted code!</p> <p>Out-of-tree code might exist because upstreaming can take a long time. For example, wakelocks or USB charging aren't upstream. Other changes like scheduler rewrites are simply not upstreamable. The kernel moves too slowly for people shipping phones every 6 months.</p> <p>This is a collision of two points of view: manufacturers say that "nobody will remember our product next year", while kernel developers say they've been here for 25 years and intend to continue to be here. This is quite a challenge that the community will have to overcome.</p> <h2>GPL enforcement</h2> <p>To sue or not to sue?</p> <p>Some say that companies will not comply without the threat of enforcement. Others say that lawsuits would just shut down any discussions with companies that might become contributors in the future.</p> <p>Contribution stats show that the absolute maximum share of independent contributors is about 15%, and that the rest of the contributions are coming from people being paid by companies to do so.
Therefore alienating those companies might not be the best idea.</p> <p>Corbet put it this way: do we want support for this current device eventually, or do we want support from companies indefinitely?</p> <h1>entry_*.S : A carefree stroll through kernel entry code</h1> <p>by Borislav Petkov; <a href="https://www.youtube.com/watch?v=f0mEz0XQQMY&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=3">video</a></p> <p>There are a few reasons for entry into the kernel: system calls, interrupts (software/hardware), and architectural exceptions (faults, traps and aborts).</p> <p>Interrupt or exception entry needs an IDT (Interrupt Descriptor Table). The interrupt number indexes into it, for example.</p> <p>Legacy syscalls had quite an overhead due to segment-based protections. This evolved with long mode, which requires a flat memory model with paging. Borislav then explained how to set up the MSRs to go into the syscall.</p> <p>The ABI described is x86-specific (Borislav is an x86 maintainer), and covers which registers to set up (rcx, rip, r11) in order to do a long mode syscall. Borislav explained what the kernel does on x86. Which flags should be set or reset? Read his slides (or the kernel code) for a nice description.</p> <h2>entry_SYSCALL_64 …</h2> <p>… is the name of the function that takes 6 arguments in registers and that is run once we're in the kernel. </p> <p>SWAPGS is then executed, GS and FS being among the only segments still used. Then the userspace stack pointer is saved.</p> <p>Then the kernel stack is set up (with a per-CPU variable), reading the cpu_tss struct appropriately.</p> <p>Once the stack is set up, the user pt_regs is constructed and handed to helper functions. A full IRET frame is set up in case of preemption.</p> <p>After that the thread info flags are looked at in case there's a special situation that needs handling, like ptrace'd syscalls.</p> <p>Then the syscall table is looked at, using the syscall number in RAX.
Depending on the syscall's needs, it is called in slightly different ways.</p> <p>Once the syscall has been called, there is some exit work, like saving the regs, moving pt_regs on the stack, etc.</p> <p>A new thing on the return path is SYSRET, which is faster than IRET, the latter being implemented in microcode (saving ~80ns in syscall overhead). SYSRET does fewer checks. Whether the slowpath or the fastpath is taken depends on the syscall.</p> <p>If the opportunistic SYSRET fails, the IRET is done, after restoring registers and swapping GS again.</p> <p>On the legacy path, for 32-bit compat syscalls, there might be a leak of 16 bits of ESP, which is fixed with per-CPU ministacks of 64B, which is the cacheline size. Those ministacks are RO-mapped so that IRET faults are promoted and get their own stack[…].</p> <h1>cgroup v2 status update</h1> <p>by Tejun Heo; <a href="https://www.youtube.com/watch?v=RLqXG4ArPe4&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=4">video</a></p> <p>The new cgroup rework started in Sep 2012 with gradual cleanups.</p> <p>The experimental v2 unified hierarchy support was implemented in Apr 2014.</p> <p>Finally, the cgroup v2 interface was exposed in 4.5.</p> <h2>Differences in v2</h2> <p>The resource model is now consistent for memory, io, etc. Accounting and control are the same.</p> <p>Some resources spent can't be charged immediately. For instance, an incoming packet might consume a lot of CPU in the kernel before we know to which cgroup to charge these resources.</p> <p>There's also a difference in granularity, or delegation. For example, what to do when a cgroup is empty is well defined, with proper notification of the root controllers.</p> <p>The interface conventions have been unified: for example, for weight-based resource sharing, the interfaces are consistent across controllers.</p> <h2>CPU controller controversy</h2> <p>The CPU controller is still out of tree.
There are disagreements around core v2 design features, see <a href="http://lwn.net/Articles/697366/">this LWN article</a> for details.</p> <p>A disagreement comes from page-writeback granularity, i.e. how to tie a specific writeback operation to a specific thread as opposed to a resource domain.</p> <p>Another main reason is process granularity. The scheduler only deals with threads, while cgroups don't have thread granularity, only process-level granularity. This is one of the major disagreements.</p> <p>The scheduler priority control (nice syscall) is a very different type of interface from the cgroup control interface (echo in a file).</p> <p>Discussion on this subject is still ongoing.</p> <h2>The rest</h2> <p>A new pids controller was added in 4.3. It allows controlling the small resource that is the PID space (15 bits) and prevents its depletion.</p> <p>Namespace support was added in 4.6, hiding the full cgroup path when you're in a namespace for example. There are still other bugs.</p> <p>An rdma controller is incoming as well.</p> <h2>Userland support</h2> <p>systemd 232 will start using cgroup v2, including the out-of-tree cpu controller. It can use both cgroup v1 and v2 interfaces at the same time.</p> <p>libvirt support is being worked on as well by Tejun Heo, who is currently deploying it with systemd at Facebook.</p> <p>We've had some interesting questions from the audience with regard to some old quirks and security issues in cgroups, but Tejun is quite optimistic that v2 will fix many of those issues and bugs. </p> <p>Old userland tools will probably be broken once cgroup v2 is the norm, but they are fixable.</p> <h1>from git tag to dnf update</h1> <p>by Konstantin Ryabitsev; <a href="https://www.youtube.com/watch?v=vohrz14S6JE&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=5">video</a></p> <p><a href="http://mricon.com/git2dnf">How is the kernel released?
(presentation)</a></p> <h2>Step 1: the git tag</h2> <p>It all starts with a signed git tag pushed by Linus. The transport is git+ssh for the push.</p> <p>It connects to git master, a server in Portland, Oregon maintained by the Linux Foundation.</p> <p>The ssh transport passes the established connection to a gitolite shell. gitolite uses the public key of the connection (through an env variable) to identify the user. Then the user talks to the gitolite daemon.</p> <p>Before the push is accepted, two-factor authentication is done via 2fa-val. This daemon allows the user to validate an IP address for a push. It uses the TOTP protocol. The 2fa token is sent through ssh by the user. It allows the user to authorize an IP address for a certain period of time (usually 24h).</p> <p>Once the push is accepted, gitolite passes control to git for the git protocol transfer.</p> <p>As a post-commit hook, the "grokmirror" software is used to propagate changes to the frontend servers.</p> <p>grokmirror updates a manifest (a gzipped json file) that is served through httpd, on a non-publicly accessible server.</p> <p>On a mirror server connected through a VPN, the manifest is checked for changes every 15 seconds, and if there's a change, the git repo is pulled.</p> <p>On the frontend, the git daemon is running, serving the updated repo.</p> <h2>Step 2: the tarball</h2> <p>To generate the tar, the git archive command is used. The file is then signed with gpg.</p> <p>kup (kernel uploader) is then used to upload the tarball. Or it can ask the remote to generate the tarball itself from a given tag, saving lots of bandwidth. Only the signature is then uploaded. Then the archive is compressed and put in the appropriate public directory.</p> <p>kup uses ssh transport as well to authenticate users.
The kup server stores the tarball in temporary storage.</p> <p>The tarball is then downloaded by the marshall server, and copied over NFS to the pub master server.</p> <p>The pub master server is mounted over NFS on a Raspberry Pi that watches for directory changes and updates the sha256sums file signatures. On marshall, a builder server checks if the git tag and tarball are available and then runs pelican to update the kernel.org frontpage.</p> <p>Finally, to publicly get the tarballs, you shouldn't use ftp. It is recommended to use https or rsync, or even https://cdn.kernel.org which uses Fastly.</p> <h1>Maintainers Don't Scale</h1> <p>by Daniel Vetter; <a href="https://www.youtube.com/watch?v=gZE5ovQq9g8&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=6">video</a>, <a href="https://lwn.net/Articles/703005/">LWN article</a></p> <p><em>I took a break here so you'll only find a summary of the talk. <a href="https://kernel-recipes.org/en/2016/talks/maintainers-dont-scale/">Talk description here</a></em> </p> <p>Daniel presented the new model adopted by the open source Intel graphics team to include every regular contributor as a maintainer. His trick? Give them all commit access.</p> <p>The foreseen problems failed to materialize. Everything now works smoothly. Can this process be applied elsewhere?</p> <h1>Patches carved into stone tablets</h1> <p>by Greg Kroah-Hartman; <a href="https://www.youtube.com/watch?v=L8OOzaqS37s&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=7">video</a>, <a href="https://lwn.net/Articles/702177/">LWN article</a></p> <p>Why do we use mail to develop the kernel? <a href="https://github.com/gregkh/presentation-stone-tools/blob/master/stone-tools.pdf">presentation</a></p> <p>Because it is faster than anything else. There are 7 to 8 changes per hour. 75 maintainers took on average 364 patches each.</p> <p>There are a lot of reviewers.</p> <p>A good person knows how to choose good tools.
So Greg reviewed a few tools.</p> <p>GitHub is really nice: free hosting, drive-by contributors, etc. It's great for small projects. The downside is that it doesn't scale for large projects. Greg gave Kubernetes as an example: there are 4000+ issues and 500+ outstanding pull requests. GitHub is getting better at handling some issues, but still requires constant Internet access, while the kernel has contributors that don't have constant Internet access.</p> <p>Gerrit's advantage is that project managers love it, because it gives them a sense of understanding what's going on. Unfortunately, it makes patch submission hard, it's difficult to handle patch series, and it doesn't allow viewing a whole patch at once if it touches multiple files. It's slow to use, and it makes local testing hard: people have to work around it with scripts. Finally, it's hard to maintain as a sysadmin.</p> <h2>email</h2> <p>Plain text email has been around since forever. It's what the kernel uses. Everybody has access to email. It works with many types of clients. It's the same tool you use for other types of work. A disadvantage is gmail, exchange, outlook: many clients suck. Gmail as a mail server is good.</p> <p>Read Documentation/email_clients.txt in order to learn how to configure yours.</p> <p>Another advantage of email is that you don't need to impose any tool. Some kernel developers don't even use git! That said, git works really well with email: it understands patches in mailbox format (git am), and you can pipe emails to it.</p> <p>Project managers don't like it though because they don't see the status.</p> <p>But there's a simple solution: you can install Patchwork, which you plug into your mailing list, and it gives you a nice overview of the current status. There's even a command line client.</p> <p>Why does it matter?
Greg says it's simple, has a wide audience, it's scalable, and grows the community by allowing everybody to read and understand how the development process works. And there are no project managers.</p> <p>Kubernetes and Docker (GitHub-native projects) are realizing this.</p> <p>Greg's conclusion is that email is currently the best (or least bad?) tool for the job.</p> <h1>Why you need a test strategy for your kernel development</h1> <p>by Laurent Pinchart; <a href="https://www.youtube.com/watch?v=aksNeWAsDPg&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=8">video</a></p> <p>Laurent showed us an example of how a very small, seemingly inconsequential change might introduce quite a bug. There's a need to test everything before submitting.</p> <p>The toolbox used when he started to test his v4l capture driver was quite simple and composed of a few tools run in the console, in two different telnet connections.</p> <p>He quickly realized that running the commands every time wouldn't scale. After writing a script simplifying the commands, he realized running the script in each of the 5 different terminal connections wouldn't scale either.</p> <p>After this, he automated even further by putting images to be compared in a directory and comparing them with the output. But the test set quickly grew to over a gigabyte of test files.</p> <p>Instead of using static files, the strategy was then to generate the test files on the fly with an appropriate program.</p> <p>He then ran into an issue where the hardware wasn't sending data according to the datasheet. While looking at the data, he discovered he had to reverse engineer how the hardware worked for a specific image conversion algorithm (RGB to HSV).</p> <p>The rule of thumb Laurent advises is to have one test per feature. And to add one test for each bug. Finally, to add a test for each non-feature.
For example, when you pass two opposite flags, you should get an error.</p> <p>The test suite Laurent developed is called vsp-tests and is used to test the specific vsp driver he has been working on. There are many other kinds of tests in the kernel (selftests, virtual drivers...), or outside of it (intel-gpu-tools, v4l2-compliance, linaro lava tests...).</p> <p>While there are many test suites in kernel development, there's no central place to run all of these. </p> <p>Regarding CI, the 0-Day project now monitors git trees and kernel mailing lists, and performs kernel builds for many architectures, in a patch-by-patch way. On failure it sends you an email. It also runs coccinelle, providing you with a patch to fix the issues coccinelle detects. Finally, it does all that in less than one hour.</p> <p>kernelci.org is another tool doing CI for kernel developers. There will be a talk about it on the next day.</p> <p>There's also Mark Brown's build bot and Olof Johansson's autobuilder/autobooter.</p> <p><em>That's it for day one of Kernel Recipes 2016!</em></p> <h1>Man-pages: discovery, feedback loops and the perfect kernel commit message</h1> <p>by Michael Kerrisk; <a href="https://www.youtube.com/watch?v=TbKtpLHjG1I&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=9">video</a></p> <p><a href="http://man7.org/conf/kr2016/man-pages-discovery-feedback-and-commit-messages-Kernel-Recipes-2016-Kerrisk.pdf">Presentation slides</a></p> <p>Michael has been contributing man pages since around 2000. There are around 1400 pages.</p> <p>When providing an overview, there are a few challenges: providing a history of the API, the reason for the design, etc.</p> <p>One of Michael's recent goals has been preventing the addition of new buggy Linux APIs. There are a few examples of this. One of the reasons is lack of prerelease testing.</p> <p>There are design inconsistencies, like the different clone() versions.
Behavioral inconsistencies might also creep in, like the mlock() vs remap_file_pages() differences in handling page boundaries.</p> <p>Many process-changing APIs have different combinations of rules for matching the credentials of the process that can make the changes.</p> <p>Another issue is long-term maintainability: an API must be extensible, its flags must be properly handled, and bad combinations must be rejected.</p> <p>We don't do proper API design in Michael's opinion. And when it fails, userspace can't be changed, and the kernel has to live with the problems forever.</p> <h2>Mitigations</h2> <p>In order to fix this, unit tests are a good first step. The goal is to prevent regressions. But where should they be put? One of the historical homes of testing was the Linux Test Project. But those tests are out of tree, with only partial coverage.</p> <p>In 2014, the kselftest project was created; it lives in-tree, and is still maintained.</p> <p>A test needs a specification. It turns out specifications help tell the difference between the implementation and the intent of the programmer. It's recommended to put the specification at the minimum in the commit message, and at best to send a man-page patch.</p> <p>Another great mitigation is to write a real application. inotify is a good example of that: it took Michael 1500 lines of code to fully understand the limitations and tricks of inotify. For example, you can't know which user/program made a given file modification. The non-recursive monitoring nature of inotify also turned out to be quite expensive for a large directory.
A few other limitations were found while writing an example program.</p> <p>The main point is that you need to write a real-world application if you're writing any non-trivial API in order to find its issues.</p> <p>Last but not least, writing good documentation is a great idea: it widens the audience, allows easier understanding, etc.</p> <h2>Issues</h2> <p>A problem though is the discoverability of new APIs. A good idea is to Cc the linux-api mailing list. Michael runs a script to watch for changes, for example. It's an issue because ABI changes sometimes happen involuntarily, while they are a complete no-no in kernel development.</p> <p>Sometimes, we get silent API changes. One example was an adjustment of the posix mq implementation that was discovered years after. By then it was too late to reverse. Of course, this API had no unit tests.</p> <p>The goal to fix this is to get as much feedback as possible <strong>before</strong> the API is released to the public. You should shorten the feedback loop.</p> <h2>Examples</h2> <p>The example of a recent cgroup change was given, where improvement of the commit message over the versions gave people a better understanding of the problem that was corrected. It makes life easier for reviewers, for userspace developers, etc.</p> <p>The advice to the developer for a commit message is to assume as little knowledge as possible from the audience.
This needs to be done at the beginning of the patch series so many people can give feedback.</p> <p>The last example is Jeff Layton's OFD locks: he made a near-perfect API change proposal, well explained, with example programs, publicity, a man-page patch, a glibc patch, and even going as far as proposing a POSIX standard change.</p> <p>In response to a question from the audience about the state of the process for introducing Linux kernel changes, Michael went as far as to propose that there be a full-time Linux Userspace API maintainer, considering the huge amount of work that needs to be done.</p> <h1>Real Time Linux: who needs it? (Not you!)</h1> <p>by Steven Rostedt; <a href="https://www.youtube.com/watch?v=4UY7hQjEW34&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=10">video</a></p> <h2>What is Real Time?</h2> <p>It's not about being fast. It's about determinism. It gives us repeatability, reliability, a known worst-case scenario and a known reaction time.</p> <p>Hard Real Time is mathematically provable, and has bounded latency. The more code you have, the harder it is to prove.</p> <p>With soft Real Time you can deal with outliers, but have unbounded latency.</p> <p>Examples of hard real time include airplane engine controls, nuclear power plants, etc. Examples of soft real time include video systems, video games, and some communication systems. Linux is today a Soft Real Time system.</p> <h2>PREEMPT_RT in Linux</h2> <p>It's not a Soft Real Time system because it doesn't allow for outliers or unbounded latency. But it's not Hard Real Time either because it can't be mathematically proven. Steven says it's Hard Real Time "Designed".</p> <p><em>If</em> it had no bugs, Linux would be a Hard Real Time system. It's used by financial industries, audio recording (jack), and navigational systems.</p> <p><strong>Lots</strong> of features from PREEMPT_RT have been integrated in the kernel. Examples include highres timers, the deadline scheduler, lockdep, ftrace, the mostly tickless kernel, etc.
It allowed people to test SMP-related bugs with only one CPU, since it changed the way spinlocks worked, putting Linux years ahead in SMP performance.</p> <p>But every year PREEMPT_RT also keeps evolving and getting bigger. Features not yet merged and still living in PREEMPT_RT include the conversion of spin locks to sleeping mutexes.</p> <p>Latency always happens. When you have an interrupt, it might run and steal processor time from a high-priority thread. But with threaded interrupts, you can make sure the "top half" runs for as little time as possible, just to wake up the appropriate thread that will handle the interrupt. </p> <h2>Hardware matters</h2> <p>The hardware needs to be realtime (cache/TLB misses, etc.) as well, but this is the topic of Steven's next talk. Come back tomorrow!</p> <h1>kernelci.org : 2 million kernel boots and counting</h1> <p>by Kevin Hilman; <a href="https://www.youtube.com/watch?v=kSe5GMJvqOI&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=11">video</a></p> <p>Kevin showed his growing collection of boards sitting in his home office, which is in fact part of <a href="https://kernelci.org/">kernelci.org</a>.</p> <p>Over the last years, the number of different boards supported by device trees has exploded, while board files have been slowly removed. The initial goal was therefore to test as many boards as possible, while trying to keep up with the growing number of new machines.</p> <p>It started with the automation of a small board farm, and then grew into kernelci.org, which builds, boots and reports on the status through web, mail or RSS.</p> <p>Many trees are being tested, with many maintainers requesting that their tree be part of the project.</p> <p>The goal of kernelci.org is to boot those kernels. Building is just a required step. There are currently 31 unique SoCs, across four different architectures, with 200+ unique boards.</p> <p>A goal is to quickly find regressions on a wide range of hardware. Another goal is to be distributed.
Anyone having a board farm can contribute. There are currently labs at Collabora, Pengutronix, BayLibre, etc. And all of this is done in the open, by a small team, none of its members working full-time on it.</p> <p>Once booted, a few test suites are run, but no reporting or regression testing is done, and this is only done on a small subset of platforms. The project is currently looking for help in visualization and regression detection, since the logs of these tests aren't automatically analyzed. They also would like to have more hardware dedicated to long-running tests.</p> <p>They have a lot of ideas for new features that might be needed, like comparing the size of kernel images, boot times, etc.</p> <p>The project is also currently working on energy regressions. The project uses the ARM energy probe and BayLibre's ACME to measure power during boot, tests, etc. The goal is to detect major changes, but this is still under development. Data is being logged, but not reported or analyzed either.</p> <p>How to help? A good way to start might be to just try it, and watch the platforms/boards you care about. The project is looking for contributors in tools, but also for people to automate their lab and submit the results. For the lazy, Kevin says you can just send him the hardware, as long as it's not noisy.</p> <p>Kevin finally showed his schematics to plug in many boards, using an ATX power supply, with USB-controlled relays and huge USB hubs. The USB port usage explodes since in the ARM space, many boards need a USB power supply, and then another USB port for the serial converter.</p> <h1>Debian's support for Secure Boot in x86 and arm</h1> <p>by Ben Hutchings; <a href="https://lwn.net/Articles/703001/">LWN article</a></p> <p>Secure Boot is an optional feature in UEFI that protects against persistent malware if implemented correctly. The only commonly trusted certificates on PCs are for Microsoft signing keys.
They will sign bootloaders on PCs for a small fee, but for Windows ARM the systems are completely locked down.</p> <p>For GNU/Linux, the first stage needs an MS signature. Most distributions ship "shim" as a first stage bootloader that won't need updating often.</p> <p>For the kernel, you can use Matthew Garrett's patchset to add a 'securelevel' feature, activated when booted with Secure Boot, that makes module signatures mandatory, and disables kexec, hibernation and other peek/poke kernel APIs. Unfortunately this patchset is not upstream.</p> <p>The issue with signatures is that you don't want to expose signing keys to build daemons. You need to have reproducible builds that can't depend on anything secret, therefore you can't auto-build the signed binary in a single step. Debian's solution is to have an extra source package: there is a first one from which you build the unsigned binary, and a second one in which you put signatures you generated offline.</p> <p>This new package is called <strong>linux-signed</strong>. It contains detached signatures for a given version, and a script to update them for a new kernel version. This is currently done on Ben's machine, and the keys aren't trusted by grub or shim.</p> <p>Signing was added to the Debian archive software dak. This allows converting unsigned binaries to signed ones.</p> <p>While this was already done in Ubuntu, the process is different for Debian (it doesn't use Launchpad). Debian signs kernel modules, has detached signatures (as opposed to Ubuntu's signed binaries), and supports more architectures than amd64. Finally, the kernel packages from Ubuntu and Debian are very different.</p> <p>Julien Cristau then came on stage to explain his work on signing with a PKCS#11 hardware security module (Yubikey for now).
Signing with an HSM is slow though, so this is only done for the kernel image, not modules.</p> <p>You can find more information on the current status of <a href="https://wiki.debian.org/SecureBoot">Secure Boot on the Debian wiki</a>. The goal is to have all of this ready for the stretch release, which freezes in January 2017.</p> <h1>The current state of kernel documentation</h1> <p>by Jonathan Corbet; <a href="https://www.youtube.com/watch?v=UHbq1SzmfUE&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=18">video</a></p> <p>Documentation is unsearchable, and not really organized. There is no global vision, and everything is a silo.</p> <p>Formatted documentation (in source code) is interesting because it's next to the code. It's generated with "make htmldocs", and is a complex multi-step system developed by kernel developers. It parses the source files numerous times for various purposes, and is really slow. The output is ugly, and doesn't integrate with the rest of the Documentation/ directory.</p> <p>How to improve this? Jon says it needs to be cleaned up, while preserving text access.</p> <p>Recently, asciidoc support was added in kernel comments. It has some advantages but adds a dependency on yet another tool.</p> <p>Jon suggests that it would have been better to get rid of DocBook entirely, and rework the whole documentation build toolchain instead of adding new tools on top.</p> <p>To do this, Jon had a look at Sphinx, a documentation system in Python using reStructuredText. It is designed for documenting code and generating large documents, and is widely supported.</p> <p>After Jon posted a proof of concept, Jani Nikula took responsibility and pushed it into a working system. It now supports all the old comments, but also supports RST formatting.
To include kerneldoc comments, Jani Nikula wrote an extension module to Sphinx.</p> <p>All this work has been merged for 4.8, and there are now Sphinx documents for the kernel doc HOWTO, GPU and media subsystems.</p> <p>Developers seem to be happy for now, and a new manual is coming in 4.9: Documentation/driver-api is a conversion of the device drivers book. Of course, this is just the beginning, as there are <em>lots</em> of files to convert to the new format, and Jon estimates this might take years until it's done.</p> <p>For 4.10, a goal would be to consolidate the development process docs (HOWTO, SubmittingPatches, etc.) into a document. The issue here is that some of these files are really well-known, and often pointed-to, and this would break a lot of "links" in a way.</p> <h1>Landlock LSM: Unprivileged sandboxing</h1> <p>by Mickaël Salaün; <a href="https://www.youtube.com/watch?v=OJ9LuNEP-D8&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=12">video</a>, <a href="https://lwn.net/Articles/703876/">LWN article</a></p> <p>The goal of Landlock is to restrict processes without needing root privileges.</p> <p>The use case is to be used by sandboxing managers (flatpak for example).</p> <p>Landlock rules are attached to the current process via seccomp(). They can also be attached to a cgroup via bpf().</p> <p>Mickaël then showed a demo of the sandboxing with a simple tool limiting the directories a given process can access.</p> <p>The approach is similar to Seatbelt or OpenBSD Pledge. It's here to minimize the risk of sandbox escape and prevent privilege escalation.</p> <p>Why do existing features not fit with this model ? 
The four other LSMs didn't fit the needs because they are designed to be controlled by the root/admin user, while Landlock is accessible to regular users.</p> <p>seccomp-BPF can't be used because it can't filter arguments that are pointers: it can't dereference userland memory to do deep filtering of syscalls.</p> <p>The goal of Landlock is to have a flexible and dynamic rule system. It of course has hard security constraints: it aims to minimize the attack surface, prevent DoS, and be able to work for multiple users by supporting independent and stackable rules.</p> <p>The initial thought was to extend the seccomp() syscall, but then it was switched to eBPF. The access rules are therefore sent to the kernel with bpf().</p> <p>Landlock uses LSM hooks for atomic security checks. Each rule is tied to one LSM hook. It uses a map of handles, a native eBPF structure, to give rules access to kernel objects. It also exposes to eBPF rules filesystem helpers that are used to handle tree access, or fs properties (mount point, inode, etc.).</p> <p>Finally, bpf rules can be attached to a cgroup thanks to a patch by Daniel Mack, and Landlock uses this feature.</p> <p>Rules are either enforced with the process hierarchy, with the seccomp() interface to which Landlock adds a new command; or via cgroups for container sandboxing.</p> <p>The third RFC <a href="http://lwn.net/Articles/700607/">patch series for Landlock is available here</a>.</p> <h1>Lightning talks</h1> <h2>the Free Software Bastard Guide</h2> <p>by Clement Oudot; <a href="https://www.youtube.com/watch?v=yHjtjoqE7V0&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=13">video</a></p> <p>This is a nice compilation of things not to do as a user, developer or enterprise. 
While the talk was very funny, I won't do you the offense of making a summary since I'm sure all my readers are very disciplined open source contributors.</p> <p><a href="http://fr.slideshare.net/coudot/kr2016-the-free-software-bastard-guide">(slides)</a></p> <h2>Mini smart router</h2> <p>by Willy Tarreau; <a href="https://www.youtube.com/watch?v=hzvVtV_zmHw&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=14">video</a></p> <p>This is about a small device made by Gl-inet. It has an Atheros SoC (AR9331) with a MIPS processor, 2 ethernet ports, wireless, 64MB of RAM and 16MB of flash.</p> <p>The <a href="http://www.haproxy.com/download/aloha/pocket/">documentation and sources for the Aloha Pocket</a>, a small distro running on the hardware, are available online.</p> <h2>Corefreq</h2> <p>by Cyril</p> <p>Corefreq measures Intel CPUs frequencies and states. It gives you a few hardware metrics. You can learn more on the <a href="https://github.com/cyring/CoreFreq">Corefreq github page</a>.</p> <p><em>That's it for day two of Kernel Recipes 2016 !</em></p> <h1>Speeding up development by setting up a kernel build farm</h1> <p>by Willy Tarreau; <a href="https://www.youtube.com/watch?v=vwQ-KcjskRw&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=15">video</a>, <a href="https://lwn.net/Articles/702375/">LWN article</a></p> <p>Some people might spend a lot of time building the Linux kernel, and this hurts the development cycle/feedback loop. Willy says during backporting sessions, the build time might dominate the development time. The goal here is to reduce the wait time.</p> <p>In addition, build times are often impossible to predict when you might have an error in the middle breaking the build.</p> <p>Potential solutions include buying a bigger machine or using a compiler cache, but this does not fit Willy's use case.</p> <p>Distributed building is the solution chosen here. But as a first step, this required a distributed workload, which isn't trivial at all for most projects. 
Fortunately, the Linux kernel fits this model.</p> <p>You need to have multiple machines, with the exact same compiler everywhere. Willy's proposed solution is to build the toolchain yourself, with crosstool-ng. You then combine this with distcc, which is a distributed build controller, with low overhead.</p> <p>Distcc still does the preprocessing and linking steps locally, which will consume approx 20% to 30% of the build time. And you need to disable gcov profiling.</p> <p>In order to measure efficiency of a build farm, you need to compare performance. This requires a few steps to make sure the metric is consistent, as it might depend on the number of files, CPU count, etc. Counting lines of code after preprocessing might be a good idea to have a per-line metric. </p> <h2>Hardware</h2> <p>In order to select suitable machines, you first need to consider what you want to optimize for. Is it build performance at a given budget, number of nodes, or power consumption ?</p> <p>Then, you need to wonder what impacts performance. CPU architecture, DRAM latency, cache sizes and storage access time are all important to consider. </p> <p>For the purpose of measuring performance, Willy invented a metric he calls "BogoLocs". He found that DRAM latency and L3 cache are more important for performance than CPU frequency.</p> <p>To optimize for performance, you must make sure your controller isn't the bottleneck: its CPU or network access shouldn't be saturated for instance.</p> <p>PC-type machines are by far the fastest, with their huge cache and multiple memory channels. However, they can be expensive. A good idea might be to look at gamer-designed hardware, that provides the best cost-to-performance ratio.</p> <p>If you're optimizing for a low number of nodes, buy a single dual-socket, high-frequency x86 machine with all RAM slots populated.</p> <p>If you're optimizing for hardware costs, a single 4-core computer can cost $8 (NanoPi). 
But there are a few issues: there are hidden costs (accessories, network, etc.), it might be throttled when heating, some machines are unstable because of overclocking, while only achieving up to 1/16th of the performance of an $800 PC.</p> <p>You can also look at mid-range hardware (NanoPI-M3, Odroid C2), up to quad-core Cortex A9 at 2GHz. But then they run their own kernel. "High-range" low cost hardware is often sold as "set-top-boxes" (MiQi, RKM-v5, etc.). Some of these can even achieve 1/4th of the performance of an $800 PC. But there are gotchas as well, with varying build quality, high power draw, and thermal throttling.</p> <p>The MiQi board at $35 is Willy's main choice according to his performance measurements (or its CS-008 clone). It's an HDMI dongle that can be opened and used barebones. You don't need to use a big Linux distribution; a simple chroot is enough for gcc and distcc.</p> <p><a href="http://wiki.ant-computing.com/Choosing_a_processor_for_a_build_farm">All the data from this presentation is on a wiki</a>.</p> <h1>Understanding a real-time system: more than just a kernel</h1> <p>by Steven Rostedt; <a href="https://www.youtube.com/watch?v=w3yT8zJe0Uw&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=16">video</a></p> <p>Real-time is hard. Having a preempt-rt patched kernel is far from enough. You need to look at the hardware underneath, and the software on top of it, and in general have a holistic view of your system.</p> <p>A balance between a Real-Time system versus a "Real-Fast" system needs to be found.</p> <p>You have to go with Real-Time hardware if you want a real-time system. It's the foundation, and if you don't have it, you can forget about your goal.</p> <h2>Non-real-time hardware features</h2> <p>Memory cache impacts determinism. 
One should find the worst-case scenario, by trying to run without the cache.</p> <p>Branch prediction misses can severely impact determinism as well.</p> <p>NUMA, used on multi-CPU hardware, can cause issues whenever a task tries to access memory from a remote node. So the goal is to make sure a real-time task always uses local memory.</p> <p>Hyper-Threading on Intel processors (or AMD's similar tech) is recommended to be disabled for Real-Time.</p> <p>The Translation Lookaside Buffer is a cache for page tables. But this means that any miss would kill determinism. Looking for the worst-case scenario during testing by constantly flushing the TLB is needed for a real-time system.</p> <p>Transactional Memory allows for parallel action in the same critical section, so it makes things pretty fast, but makes the worst case scenario hard to find when a transaction fails.</p> <p>System Management Interrupt (SMI) puts the processor in System Management Mode. On a customer box, Steven was able to find that every 14 minutes, an interrupt was eating CPU time, that was in fact an SMI for ECC memory.</p> <p>CPU Frequency scaling needs to be disabled (idle polling); while not environmentally friendly, it's a necessity for determinism. </p> <h2>Non-real-time software features</h2> <p>When you're using threaded interrupts, you need to be careful about priority, especially if you're waiting for important interrupts, like network if you're waiting for data.</p> <p>Softirqs need to be looked at carefully. They are treated differently in PREEMPT_RT kernels, since they are run in the context of who raises them. 
Except when they are raised by real Hard interrupts like RCU or timers.</p> <p>System Management Threads like RCU, watchdog or kworker also need to be taken into account, since they might be called as a side-effect of a syscall required by the real-time application.</p> <p>Timers are non-evident as well and might be triggered with signals, that have weird posix requirements, making the system complex, also impacting determinism.</p> <p>CPU Isolation, whether used with the isolcpus kernel command line parameter, or with cgroup cpusets can help determinism if configured properly.</p> <p>NO_HZ is good for power management thanks to longer sleeps, but might kill latency since coming out of sleep can take a long time, leading to missed deadlines.</p> <p>NO_HZ_FULL might be able to help with real-time once ready, since it can keep the kernel from bothering real-time tasks by removing the last 1-second tick.</p> <p>When writing an RT Task, memory locking with mlockall() is necessary to prevent page faults from interrupting your threads. Enabling priority inheritance is a good idea to prevent some types of locking situations leading to unbounded latency.</p> <h1>Linux Driver Model</h1> <p>by Greg Kroah-Hartman; <a href="https://www.youtube.com/watch?v=AdPxeGHIZ74&amp;list=PLQ8PmP_dnN7L5OVT95uXJAE78qcGCcDVm&amp;index=17">video</a></p> <p><a href="https://github.com/gregkh/presentation-driver-model">Presentation files</a> </p> <p>Greg says nobody needs to know about the driver model.</p> <p>If you're doing reference counting, use struct kref, it handles lots of really tricky edge cases. You need to use your own locking though.</p> <p>The base object type is struct kobject, it handles the sysfs representation. You should probably never use it, it's not meant for drivers.</p> <p>On top of that, struct attribute provides sysfs files for kobjects; these should never be managed individually either. The goal here is to have only one text or binary value per file. 
It prevents a problem seen in /proc where multiple values per file broke lots of applications when values were added, or unavailable.</p> <p>kobj_type handles sysfs functions, namespaces, and release().</p> <p>struct device is the universal structure, that everyone sees. It either belongs to a bus or a "class".</p> <p>struct device_driver handles a driver that controls a device. It does the usual probe/remove, etc.</p> <p>struct bus_type binds devices and drivers, and handles matching, uevents and shutdown. Writing a bus is a complex task, it requires at least 300 lines of code, and has lots of responsibilities, with few helper functions.</p> <p>Creating a device is not easy either, as you should set its position in the hierarchy (bus type, parent), the attributes and initialize it in a two-step way to prevent race conditions.</p> <p>Registering a driver is a bit simpler (probe/release, ownership), but still complex. struct class are userspace-visible devices, very simple to create (30-40 lines of code). A class has a lot of responsibilities, but most of those are handled by default, so not every driver has to implement them.</p> <p>Greg says usb is not a good example to understand the driver model, since it's complex and stretches it to its limits. The usb2serial bus is a good example.</p> <p>The implementation relies on multiple levels of hierarchy, and has lots of pointer indirections throughout the tree in order to find the appropriate function for an operation (like shutdown()).</p> <p>Driver writers should only use attribute groups, and (almost) never call sysfs_*() functions. Greg says you should never use platform_device. 
This interface is abused instead of using a real bus, or the virtual bus.</p> <p>Greg repeated that raw sysfs/kobjects should never be used.</p> <h1>Analyzing Linux Kernel interface changes by looking at binaries</h1> <p>by Dodji Seketeli; <a href="https://lwn.net/Articles/703890/">LWN article</a></p> <p>What if we could see changes in interfaces between the kernel and its modules just by looking at the ELF binaries ? It would be a kind of diff for binaries, and show changes in a meaningful way.</p> <p>abidiff already does <em>almost</em> all of this for userspace binaries. It builds an internal representation of an API corpus, and can build differences. Dodji showed us how abidiff works. </p> <p>Unfortunately, there's nothing yet for the Linux Kernel. Dodji entertains the idea of a "kabidiff" tool that would work like abidiff, but for the Linux kernel.</p> <p>For this to work, it would need to handle special Linux ELF symbol sections. For instance, it would restrain itself to "__export_symbol" and "__export_symbol_gpl" sections. It would also need to support augmenting an ABI corpus with artifacts from modules.</p> <p>In fact, work on this has just started in the dodji/kabidiff branch of libabigail.git.</p> <h1>Video color spaces</h1> <p>by Hans Verkuil; <a href="https://lwn.net/Articles/703142/">LWN article</a></p> <p>struct v4l2_pix_format introduced in kernel 3.18 is the subject of the talk.</p> <p>Hans started by saying that Color is an illusion, interpreted by the brain.</p> <p>A colorspace is actually the definition of the type of light source, where the white point is, and how to reproduce it.</p> <p>Colorspaces might be linear, but neither human vision nor early CRTs were. So to convert from a linear to non-linear colorspace, you define a transfer function.</p> <p>In video, we often use the Y'CbCr (YUV) colorspace. Converting to and from RGB is possible. 
You can represent all colors in all colorspaces, as long as you don't do quantization (cut off values &lt;0 and &gt;1), which is why you should always do it last.</p> <p>There are a few standards to describe colorspaces: Rec 709, sRGB, SMPTE 170M, and lately BT 2020 used for UHDTVs.</p> <p>Typically, colorspace names might be confusing, the conversion matrices might be buggy, and applications would just ignore colorspace information. Sometimes, hardware uses a bad transfer function.</p> <p>In the end Hans found that only half of the v4l2_pix_format structure fields were useful.</p> <p>Hans showed examples of the difference of transfer functions between SMPTE 170M and Rec.709. The difference between Rec. 709 and sRGB, or between Rec.709 and BT.601 Y'CbCr is more visible. Those examples would be impossible to see on a projector, but luckily the room at Mozilla's has huge LCD screens. But even there, it's not enough, since with Full/Limited Range Quantization, a light grey color visible on Hans' screen, was simply white while displayed on the big screen and recording stream. Some piece of the video chain was just doing quantization "bad".</p> <h1>State of DRM graphics driver subsystem</h1> <p>by Daniel Vetter; <a href="https://lwn.net/Articles/703574/">LWN article</a></p> <p>The Direct Rendering Manager (DRM) subsystem is slowly taking over the world.</p> <p>Daniel started by saying that the new kerneldoc toolchain (see above talk by Jonathan Corbet) is really nice. Everything with regards to the new atomic modesetting is documented. Lots of docs have been added.</p> <p>Some issues in the old userspace-facing API are still there. Those old DRI1 drivers can't be removed, but have been prefixed with drm_legacy_ and isolated.</p> <p>The intel-gpu-tools tests have been ported to be generic, and are starting to get used by many drivers. 
Some CI systems have been deployed, and documentation added.</p> <p>The open userspace requirement has been documented: userspace-facing api in DRM kernel code requires an open source userspace program. </p> <p>Atomic display drivers have allowed flicker-free modesetting, with check/commit semantics. It has been implemented because of hardware restrictions. It also allows userspace to know in advance if a given modification would be possible. You can then write userspace that can try approaches, without becoming too complex.</p> <p>20 drivers and counting have been merged with an atomic interface, with 2 or 3 per release, as opposed to one per year (1 per 4 or 5 releases) in the 7 years before atomic modesetting. There's a huge acceleration in development, driving lots of small polish, boiler-plate removals, documentation and new helpers.</p> <p>There's a bright future, with the drm api being used in android, wayland, chromeos, etc. Possible improvements include a benchmark mode, or more buffer management like android's ion.</p> <p>A generic fbdev implementation has been written on top of KMS.</p> <p>Fences are like struct completion, but for DMA. Implicit fences are taken care of by the kernel. Explicit fences can be passed around by userspace. Fences allow synchronisation between components of a video pipeline, like a decoder and an upscaler for example.</p> <p>With upcoming explicit fencing support in kernel and mesa, you can now run Android on upstream code, with graphics support.</p> <p>The downside right now is the bleak support of rendering in open drivers. There are 3 vendor-supported, 3 reverse-engineered drivers, and the rest is nowhere to be seen.</p> <h1>The new hwmon device registration API</h1> <p>by Jean Delvare</p> <p>The hwmon subsystem is used for hardware sensors available in every machine, like temperature sensors for example.</p> <p>hwmon has come a long way. 
10 years ago, it became unmaintainable, with lots of device-specific code in userspace libraries.</p> <p>The lm-sensors v2 in 2004 was based on procfs for kernel 2.4, and sysfs for kernel 2.6.x.</p> <p>In 2006, there was no standard procfs interface. Therefore, for lm-sensors v3, documentation was written, standards were enforced, and the one-value per sysfs file rule was adopted. No more device-specific code in libsensors and applications was allowed. Support for new devices could finally be added without touching user-space.</p> <h2>kernel-space</h2> <p>Once the userspace interface was fixed, it did not mean the end of the road.</p> <p>It turned out that every driver implemented its own UAPI. So in 2005, a new hwmon sysfs class was submitted. It was quite simple, and all drivers were converted to the new subsystem at once.</p> <p>It worked for a while, but wasn't sufficient. In 2013, a new hwmon device registration API was introduced: hwmon_register_with_groups. It gives the core flexibility, and allows it to validate the device name. Later that year a new API was added to help unregister and cleanup.</p> <p>Finally, in July 2016 a new registration API was proposed, moving hwmon attributes into the core, and doing the heavy lifting of setting up sysfs properly. This patchset is still under review and discussion. Driver conversion won't be straightforward at all, but still deletes more code than it adds.</p> <p>In conclusion, a good subsystem should help drivers, integrate well into the kernel, and offer a standard interface. It should provide a smaller binary size and have fewer bugs. But there are still concerns with regards to performance issues, and added complexity because of too many registration functions.</p> <p><em>That's it for Kernel Recipes 2016 ! 
Congratulations if you managed to read everything !</em></p>Making a Twitter bot that looks for hashes2016-09-09T18:00:00+02:002016-09-09T18:00:00+02:00Anisse Astiertag:anisse.astier.eu,2016-09-09:making-a-twitter-bot-that-looks-for-hashes.html<p><em>This is a followup to <a href="what-do-you-find-when-you-search-twitter-for-hashes.html">What do you find when you search Twitter for hashes ?</a></em></p> <h1>Why ?</h1> <p>I'm not sure I remember how it started.</p> <p>It all started four years ago. Jon Oberheide was still an independent security researcher and not yet CTO of a successful product company. He posted some hashes on twitter. I was perplexed at first but then I quickly understood that it was to serve as a proof in case someone disputed his research's findings later (and the timing at which he found his results). He was posting hash proofs. And then Matthew Garrett did it too.</p> <p>At the time Twitter wasn't very reliable for accessing old tweets (they vastly improved). I thought maybe by finding these hash proofs and indexing them, we could serve as an independent verifier. Nowadays all the kids put their hashes in the Bitcoin blockchain, and there are even services to do it from your browser.</p> <h2>So how to do it ?</h2> <p>This ought to be easy, right? The initial idea was to just do a simple search of random characters in the hexadecimal space, and hope that they are in hashes ? Well, not really. At first I thought it could be done, but it can't, because twitter search only works on full words, since it's tokenizing for indexing purposes. Which means you can't search for parts of words hoping to stumble upon hashes. So much for using n-grams.</p> <p>Therefore, I had to use the public sample stream, and filter <em>every</em> tweet in order to find relevant ones.</p> <h2>Firehose ? Not likely.</h2> <p>Twitter has a special stream that contains all the tweets being posted, called "Firehose". Few people get access to it. 
There are two other streams: Gardenhose, containing 10% of the tweets, and Spritzer, the sample stream containing 1% of the tweets. The bot currently runs on Spritzer, and Gardenhose was requested, but I never got an answer. It's part of the <a href="http://gnip.com">monetization</a> <a href="http://support.gnip.com/apis/firehose/overview.html">strategy</a>. No place here for hacker/hobbyists.</p> <p>So only 1% of tweets (I tried to verify that with other public data, and it seems about right despite my initial thoughts); that's why the bots haven't been talking much together yet. It also means there's a 99% chance of missing your tweet. And that development iteration speed is a hundred times slower.</p> <h1>How does it work ?</h1> <p>The initial version used a naive regex, but had too many false positives, from repeated characters, to magnet links of P2P files. Now it's much harder to match.</p> <p>The regex is currently matching MD5, SHA1, SHA256 and SHA512 sizes. Most uses are covered.</p> <p>I added a naive exclusion filter (all letters or all numbers), which might not detect extremely well crafted hashes a researcher might be working on. This is out of scope for hashproofs, the anti-spam measures are already pretty strong and might miss interesting content.</p> <h2>Current approach</h2> <p>The first stage is a simple regex <code>[a-f0-9]{32,128}</code> . I wanted it as simple as possible because it is run on every tweet, and should be as fast as possible.</p> <p>The second stage is a much more complex regex (harder to match), with specific sizes of various hashes.</p> <p>Then there are lots of manually crafted filters to fight off spam. Blocked keywords. Users banned automatically. Embedded images and most links are blocked.</p> <p>Finally there is entropy measurement, making sure we have a hash and not a mindless series of characters.</p> <h2>Performance research</h2> <p>To improve performance, I built in quite a few tools. 
For example, there's a command allowing to dump the sample stream in a temporary file (that you're not allowed to keep). This file is then used to measure performance in a repeatable fashion (there's no contradiction here, right ?), and isolated from the network.</p> <p>I implemented different versions of the core line processing, some of which are still in the tests. I was trying to see how to speed up the code. But after some profiling, I realized that most of the time was spent in json processing. Moving to ultrajson (ujson) cut the processing time by 5, compared to python2's cjson module.</p> <h2>Bot detection and spam fighting algorithm</h2> <p>What I did was initially mostly manual: keyword based, username and client based. I kept adding new keywords and banning new clients, but it didn't scale.</p> <p>I then implemented an analysis of a matching user's timeline. Within the last 200 tweets, if it had more than 5% of hashes, it was probably a bot. It greatly cut the spam at first, and since its implementation in 2013 it has detected 14k+ accounts posting more than 5% of hashes, and 2.7k+ accounts posting more than 50%.</p> <p>There were still a LOT of things passing through (including porn). But the strategy is to use automatic (algorithmic) filtering, not manual. I had to resort to blocking most outgoing URLs, meaning there's nothing to spam for. I had to filter tweets containing images.</p> <p>Earlier this year, I discovered a spam network selling followers used the new Twitter Cards to embed links &amp; images without having a URL in the tweet, so I added a filter for that too. For some reason, they were posting lots of hashes. Maybe adding entropy helps circumvent Twitter's detection systems.</p> <h2>Challenges</h2> <p>The code is not py3k compatible for historical reasons (used to need requests-oauth, but moved since to requests-oauthlib (which at some point was inside requests)), although I love py3k. 
I also had to use ur"" strings, which were ported in python 3.3, which wasn't available at the time. The porting shouldn't be very hard.</p> <p>It was very hard to deal with twitter's intermittent service. I developed a watchdog specifically to detect hangs, and then auto-restart. It's the easy way out, but has allowed the bot to work quite well, with months-long uptimes between the updates.</p> <p>As I said earlier, it's hard to debug with a very slow stream that makes errors appear a hundred times more slowly.</p> <p>Finally, this "light" stream means there's a 99% chance of missing your tweet. Unless you have a lot of followers that RT you, but then you don't need hashproofs, do you ?</p> <h2>Potential improvements</h2> <ul> <li>Follow user stream and watch for hashes. The bot already auto follows people below a certain rate for good potential feed.</li> <li>use a hashtag (e.g #hashproof) that security researchers can use so that their important tweets are seen.</li> </ul> <h2>Gimme the code, gimme the data</h2> <p>Today I'm publishing the source code for <a href="https://github.com/anisse/hashbot">hashbot on Github</a>. 
The data is available there as well and <a href="what-do-you-find-when-you-search-twitter-for-hashes.html">analyzed in the earlier article</a>.</p> <h1>Who noticed ?</h1> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/mjg59/status/189781717698101248"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/ochsff/status/189793467872968705"></a></blockquote> <p>I actually implemented Georg's suggestion and all hashes were entropy checked after this.</p> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/bcrypt/status/486855164708810754"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/blinry/status/679276045229563905"></a></blockquote> <p>Yeah, spam was this bad (and still is to an extent).</p> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/opsecanimals/status/733577781238366208"></a></blockquote> <p>It was also noticed by @adulau <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/adulau/status/773831987127787521"></a></blockquote> He asked about the code. Which is why you're seeing this here today.</p> <h1>A few successful findings</h1> <p>There ought to be some after all ? 
Here are a few:</p> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/ioerror/status/365860766517174272"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/OpenCryptoAudit/status/476195439323406336"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/mikko/status/260827290404020224"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/PurpleTeamiOS/status/442341601072136192"></a></blockquote> <h1>Lessons from the project</h1> <p>Always test, makes for robust code.</p> <p>Always benchmark, you might have surprises, cf ultrajson that gave 5x performance speed up.</p> <p>A watchdog is essential when interacting with an external, long-lived service. Twitter has been stopping the stream while keeping the TCP socket open many times, which would mean a hang of the bot.</p> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>What do you find when you search Twitter for hashes ?2016-09-09T17:00:00+02:002016-09-09T17:00:00+02:00Anisse Astiertag:anisse.astier.eu,2016-09-09:what-do-you-find-when-you-search-twitter-for-hashes.html<p>This image:</p> <p><img alt="jpg" src="/images/accents-of-blue_small.jpg" /></p> <p>This is what I found with <a href="https://anisse.astier.eu/making-a-twitter-bot-that-looks-for-hashes.html">hashbot, a twitter bot that looks for hashes.</a></p> <h2>What is this image ?</h2> <p>Posted with the hash "2f404a288d1b564fadee944827a39a14" by japanese accounts (of which @furueru_zekkei used to be the top poster, now suspended).</p> <p>After a bit of research on google images and more, I found that this image is a photo of the <a href="http://www.dpchallenge.com/image.php?IMAGE_ID=186442">White Desert in New Mexico, by Greg Riegler</a>. 
This might or might not be the same Greg Riegler as <a href="http://gregriegler.com/">here</a>.</p> <h2>Why is this ?</h2> <p>Bots. There are lots of them. The Internet is made of bots.</p> <p>This is what you were most likely to find until 2015 (with a 10% chance).</p> <p>How do I know that? Well, I searched. But this is a story for another post.</p> <h1>What else do you find ? Bots bots bots.</h1> <p>Alongside this, I found many Japanese bots mentioning @null</p> <p>Porn-posting bots. The Internet is made of them. For some reason they post hashes... maybe to make sure their tweets are unique and not detected as a spam network?</p> <p>Occasional git and mercurial commit IDs.</p> <p>Security researchers posting proof-of-work. This was the initial motivation behind <a href="https://twitter.com/hashproofs">hashproofs</a>.</p> <p>iPhone UDIDs. Apparently there's a 'market' on Twitter between devs and users to enable iPhones with beta builds:</p> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/Fiddop/status/344225558365876225"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/UsmanUP/status/469233226679353344"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/fmiquiza/status/759891924912254976"></a></blockquote> <p>Giveaways of various activation codes for games and digital products.</p> <p>People crowd-sourcing password hashes, and bots <a href="https://twitter.com/PlzCrack">running rainbow table queries</a>.</p> <p>Bitcoin transaction IDs:</p> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/MikeBeas/status/441866558382821376"></a></blockquote> <p>Torrent hashes:</p> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/corentin_stn/status/354294100431876096"></a></blockquote> <p>Some things are just 
impossible to understand: <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/IvaniaDelRey/status/250016102892072960"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/lakeeffect_kid/status/233214857716060160"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/aperthure/status/242521145868439553"></a></blockquote> <blockquote class="twitter-tweet" data-cards="hidden" data-lang="en"><a href="https://twitter.com/Rueangritufo/status/269656715690115072"></a></blockquote></p> <p>LOTS of bots for which more than 5% of tweets contain hashes (I found a lot). These won't appear in the results, but <a href="https://github.com/anisse/hashbot/blob/master/results/banned-usernames-2016-09-09">here is the list</a>.</p> <p>I realize how ironic it is to criticize Twitter for having a lot of bots, because the same conditions that allowed all these bots (the API) also permitted this research (as compared to a scraping bot that would have to be updated more often). Of course, hashproofs isn't really spamming; it just acts as a "curator", and does a job that would be impossible for a human to do (i.e. analyzing lots of tweets/s).</p> <p>The full list of results can be found on <a href="https://twitter.com/hashproofs">hashproofs' Twitter feed</a>.</p> <h2>Give me the data</h2> <p>I published the <a href="https://github.com/anisse/hashbot">code on GitHub</a> and the <a href="https://github.com/anisse/hashbot/tree/master/results">full results of the four-year research</a>. (WARNING: contains spam and porn links)</p> <p>This should give you the full data you need to re-analyze the results or run your own hashbot instance (with a better algorithm? or access to a better stream?)</p> <h1>Unveiling a few bot networks</h1> <p>As I explained earlier, hashproofs analyzes the user's timeline for every matching tweet. 
If the percentage of matching tweets they have is above a certain arbitrary level (5%), the username is banned locally. If it's over 50%, the account is blocked. That's why you'll find two different lists in the results. One is from Twitter, listing the ids of blocked accounts. The other is the content of the "banlist" state file of the bot.</p> <p>By analyzing the list of blocked users, I found a few legitimate bots (e.g. posting commits on Twitter, running rainbow tables, see earlier). I also found a lot of spam bots, some of which were taken care of by Twitter. I also discovered that spammers tend to rename their accounts, and my younger self only thought of tracking the usernames, not the account ids, so that's why you'll see discrepancies if you try to match the two lists.</p> <p>You'll also see that even regular users rename their accounts if you look at historical data from 2014.</p> <p>Here are a few excerpts from the banlist showing Twitter handles that I doubt were created by legitimate users:</p> <div class="highlight"><pre><span></span> 3924fe95e2cd5f8 68c59dbbb15c5a4 6298c2a08ef9b3b a33262acc8e5c77 b2dc44d67994d44 21332a575639f58 […] Cloud404aa cloud405aa cloud406aa cloud407aa […] 000xxx_6wy 000xxx_897 000xxx_dr3 […] Death_ldo Death_y7s Death_mew Death_ojy </pre></div> <p>All of those are in sequence, which means they were detected by hashproofs one after the other. 
There are many other examples like this if you want to look at all the 14k+ automatically banned handles.</p> <p>If you're interested in the <a href="making-a-twitter-bot-that-looks-for-hashes.html">historical and technical details, read on to the following article</a>.</p> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>Unofficial witty cloud module documentation with nodemcu firmware2016-02-25T00:00:00+01:002016-02-25T00:00:00+01:00Anisse Astiertag:anisse.astier.eu,2016-02-25:unofficial-witty-cloud-module-documentation-with-nodemcu-firmware.html<p>I wanted to try my hand at ESP8266 modules, so I got <a href="http://www.aliexpress.com/item/ESP8266-serial-WIFI-Witty-cloud-Development-Board-ESP-12F-module/32557886226.html">a witty cloud development board</a>. It's running a proprietary firmware from gizwits which I <a href="/static/witty-flash-backup.bin.gz">backed up</a> if anyone wants to look at it.</p> <p>The board is in two parts: a programming board ("cape") with a CH340G USB serial chip and a 3.3V converter (plus flash and reset buttons); and a main board with the ESP module, an AMS1117 3.3V voltage regulator, a button, a blue LED, an RGB LED, and a light sensor (photoresistor). All this for the <a href="http://www.aliexpress.com/item/ESP8266-serial-WIFI-Witty-cloud-Development-Board-ESP-12F-module/32557886226.html">price</a> of a nodemcu board, but in a smaller form factor.</p> <p>One of the greatest things about the ESP8266 ecosystem is <a href="https://github.com/nodemcu/nodemcu-firmware">nodemcu-firmware</a>, an environment allowing you to program the microcontroller in Lua, greatly simplifying prototyping and familiarization.</p> <p>After backing up the flash with <a href="https://github.com/themadinventor/esptool">esptool</a> (see <code>esptool read_flash</code>), I flashed the latest release of nodemcu-firmware. 
Then, using <a href="https://github.com/kmpm/nodemcu-uploader">nodemcu-uploader</a>, one can access the Lua REPL (<code>nodemcu-uploader terminal</code>) and upload Lua scripts (<code>nodemcu-uploader --baud 9600 upload init.lua</code>); <code>init.lua</code> is the first script run at power-up.</p> <h1>Quick doc</h1> <p>I reverse-engineered the various goodies that are on board, since I didn't find any documentation on this specific board online:</p> <p>Blue LED: use PWM 4. High duty cycle = OFF.</p> <div class="highlight"><pre><span></span><span class="c1">-- Use a LED with a 500Hz PWM</span> <span class="kr">function</span> <span class="nf">led</span><span class="p">(</span><span class="n">pin</span><span class="p">,</span> <span class="n">level</span><span class="p">)</span> <span class="n">pwm</span><span class="p">.</span><span class="n">setup</span><span class="p">(</span><span class="n">pin</span><span class="p">,</span> <span class="mi">500</span><span class="p">,</span> <span class="n">level</span><span class="p">)</span> <span class="n">pwm</span><span class="p">.</span><span class="n">start</span><span class="p">(</span><span class="n">pin</span><span class="p">)</span> <span class="kr">end</span> <span class="c1">-- Control the Blue LED: 0 -&gt; 1023 higher means light off</span> <span class="kr">function</span> <span class="nf">blueLed</span><span class="p">(</span><span class="n">inverted_level</span><span class="p">)</span> <span class="n">led</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="n">inverted_level</span><span class="p">)</span> <span class="kr">end</span> <span class="n">blueLed</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="c1">-- test at high intensity</span> </pre></div> <p>RGB LED: use PWMs 8, 6, 7. 
High duty cycle = ON.</p> <div class="highlight"><pre><span></span><span class="c1">-- Control an RGB LED: three 0-&gt;1023 values; higher means more light</span> <span class="kr">function</span> <span class="nf">rgb</span><span class="p">(</span><span class="n">r</span><span class="p">,</span> <span class="n">g</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="n">led</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="n">r</span><span class="p">)</span> <span class="n">led</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="n">g</span><span class="p">)</span> <span class="n">led</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="kr">end</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">500</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1">-- test RED</span> </pre></div> <p>Button: GPIO 2. 
button pressed = 0 level.</p> <div class="highlight"><pre><span></span><span class="c1">-- launch connect() on button press</span> <span class="n">gpio</span><span class="p">.</span><span class="n">mode</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">gpio</span><span class="p">.</span><span class="n">INPUT</span><span class="p">)</span> <span class="n">gpio</span><span class="p">.</span><span class="n">trig</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="s2">&quot;down&quot;</span><span class="p">,</span> <span class="n">connect</span><span class="p">)</span> </pre></div> <p>Light sensor: use the ADC.</p> <div class="highlight"><pre><span></span><span class="c1">-- Print light sensor value</span> <span class="nb">print</span><span class="p">(</span><span class="n">adc</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span> </pre></div> <h1>Going further</h1> <p>I then discovered the <a href="http://nodemcu.readthedocs.org/en/dev/en/">official nodemcu-firmware documentation</a> currently points to the dev branch; which has many new modules and functions I wanted to use (like the wifi event monitor or http module) that weren't available in master yet. 
I used the <a href="http://nodemcu-build.com/">nodemcu cloud builder</a>, a service provided by a kind community member, to build a custom version of nodemcu-firmware from the dev branch with the modules I needed enabled.</p> <p>This makes it possible to write this kind of code, which connects to wifi on a button press and reacts with a simple HTTP request:</p> <div class="highlight"><pre><span></span><span class="kr">function</span> <span class="nf">connect</span><span class="p">()</span> <span class="c1">-- if wifi is already connected (config saved), launch job directly</span> <span class="kr">if</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">status</span><span class="p">()</span> <span class="o">==</span> <span class="n">wifi</span><span class="p">.</span><span class="n">STA_GOTIP</span> <span class="kr">then</span> <span class="n">doOnlineJob</span><span class="p">()</span> <span class="kr">return</span> <span class="kr">end</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">1000</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1">-- turn orange</span> <span class="kr">for</span> <span class="n">event</span><span class="o">=</span><span class="n">wifi</span><span class="p">.</span><span class="n">STA_IDLE</span><span class="p">,</span><span class="n">wifi</span><span class="p">.</span><span class="n">STA_GOTIP</span> <span class="kr">do</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">eventMonReg</span><span class="p">(</span><span class="n">event</span><span class="p">,</span> <span class="n">monCallback</span><span class="p">)</span> <span class="kr">end</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">config</span><span 
class="p">(</span><span class="s2">&quot;mynetworkssid&quot;</span><span class="p">,</span> <span class="s2">&quot;mynetworkpassword&quot;</span><span class="p">)</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">eventMonStart</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span> <span class="c1">--the event mon polls every 100ms for a change</span> <span class="kr">end</span> <span class="kr">function</span> <span class="nf">monCallback</span><span class="p">(</span><span class="n">prevState</span><span class="p">)</span> <span class="n">state</span> <span class="o">=</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">status</span><span class="p">()</span> <span class="kr">if</span> <span class="n">prevState</span> <span class="o">==</span> <span class="kc">nil</span> <span class="kr">then</span> <span class="n">prevState</span> <span class="o">=</span> <span class="s2">&quot;unknown&quot;</span> <span class="kr">end</span> <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Wifi status &quot;</span> <span class="o">..</span> <span class="n">prevState</span> <span class="o">..</span> <span class="s2">&quot; -&gt; &quot;</span> <span class="o">..</span> <span class="n">state</span><span class="p">)</span> <span class="n">blueLed</span><span class="p">(</span><span class="n">state</span><span class="o">*</span><span class="mi">204</span><span class="p">)</span> <span class="c1">-- led intensity depends on status, with success = OFF</span> <span class="kr">if</span> <span class="n">state</span> <span class="o">==</span> <span class="n">wifi</span><span class="p">.</span><span class="n">STA_GOTIP</span> <span class="kr">then</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">200</span><span 
class="p">,</span> <span class="mi">150</span><span class="p">)</span> <span class="c1">--blue/green-ish, wifi OK</span> <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Got IP &quot;</span> <span class="o">..</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">getip</span><span class="p">())</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">eventMonStop</span><span class="p">(</span><span class="s2">&quot;unreg all&quot;</span><span class="p">)</span> <span class="c1">-- stop event monitor</span> <span class="n">doOnlineJob</span><span class="p">()</span> <span class="kr">end</span> <span class="kr">if</span> <span class="n">state</span> <span class="o">==</span> <span class="n">wifi</span><span class="p">.</span><span class="n">STA_APNOTFOUND</span> <span class="ow">or</span> <span class="n">state</span> <span class="o">==</span> <span class="n">wifi</span><span class="p">.</span><span class="n">STA_FAIL</span> <span class="kr">then</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">150</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1">-- red/fail</span> <span class="n">wifi</span><span class="p">.</span><span class="n">sta</span><span class="p">.</span><span class="n">eventMonStop</span><span class="p">(</span><span class="s2">&quot;unreg all&quot;</span><span class="p">)</span> <span class="c1">-- stop event monitor</span> <span class="kr">end</span> <span class="kr">end</span> <span class="kr">function</span> <span class="nf">doOnlineJob</span><span class="p">()</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">150</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">150</span><span 
class="p">)</span> <span class="c1">-- working, purple</span> <span class="n">http</span><span class="p">.</span><span class="n">post</span><span class="p">(</span><span class="s2">&quot;http://example.invalid/api/pushed&quot;</span><span class="p">,</span> <span class="kc">nil</span><span class="p">,</span> <span class="s1">&#39;{&quot;hello&quot;: &quot;from_esp_witty_42&quot;}&#39;</span><span class="p">,</span> <span class="kr">function</span><span class="p">(</span><span class="n">status_code</span><span class="p">,</span> <span class="n">body</span><span class="p">)</span> <span class="kr">if</span> <span class="n">status_code</span> <span class="o">==</span> <span class="kc">nil</span> <span class="ow">or</span> <span class="n">body</span> <span class="o">==</span> <span class="kc">nil</span> <span class="kr">then</span> <span class="nb">print</span><span class="p">(</span><span class="n">status_code</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">body</span><span class="p">)</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">200</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1">--fail red</span> <span class="kr">return</span> <span class="kr">end</span> <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Got code &quot;</span> <span class="o">..</span> <span class="n">status_code</span> <span class="o">..</span> <span class="s2">&quot; answer &quot;</span> <span class="o">..</span> <span class="n">body</span><span class="p">)</span> <span class="kr">if</span> <span class="n">status_code</span> <span class="o">==</span> <span class="mi">200</span> <span class="kr">then</span> <span class="n">rgb</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">200</span><span 
class="p">)</span> <span class="c1">--success, blue</span> <span class="kr">end</span> <span class="kr">end</span><span class="p">)</span> <span class="kr">end</span> </pre></div> <p>This is reproducing the software function of the <a href="https://aws.amazon.com/fr/iot/button/">DASH/IoT Button</a>, <a href="https://makeit.netflix.com/the-switch">Netflix Switch</a> or <a href="https://flic.io/">Flic</a>.</p> <p>There are <a href="http://benlo.com/esp8266/esp8266Projects.html#thebutton">a few projects</a> that will <a href="https://www.hackster.io/noelportugal/ifttt-smart-button-e11841">guide you</a> through the <a href="http://deqingsun.github.io/ESP8266-Dash-Button/">hardware part</a> of building a button with an ESP module.</p> <p>PS: Be careful of big https cert chains, there's a <a href="https://github.com/nodemcu/nodemcu-firmware/blob/7ff8326cc9ed430abcc215be651ab6b8588dc57b/app/http/httpclient.c#L384">hardcoded limit of 5120 bytes for the SSL buffer</a> in the firmware, that might make the handshake fail.</p> <p>PPS: <strong>2016-07-01</strong> I did a <a href="https://docs.google.com/presentation/d/1x2xQUi3j3OUK7PLJu5Mx89TK0mD3az2KCAk8yDGhv_4/pub?start=false&amp;loop=false&amp;delayms=60000#slide=id.p">talk on ESP8266 modules</a> at the Paris Embedded <a href="http://www.meetup.com/ParisEmbedded/events/231095289/">Meetup #9</a>.</p>Bépo-android2015-01-27T00:00:00+01:002015-01-27T00:00:00+01:00Anisse Astiertag:anisse.astier.eu,2015-01-27:bepo-android.html<p><a href="https://play.google.com/store/apps/details?id=fr.bepo.clavierexterne">This</a> is a small project I recently <a href="https://github.com/anisse/bepo-android/">released on github</a> and <a href="https://play.google.com/store/apps/details?id=fr.bepo.clavierexterne">Google Play</a>. 
It aims to cater to the needs of people who use the bépo layout and want to use it with <a href="http://bepo.fr/wiki/BepoAndroid">physical keyboards on Android</a>.</p> <p><a href="http://bepo.fr">Bépo</a> is a French Dvorak-like keyboard layout; it was designed by a community of enthusiasts, and is now included in Xorg. Its platform support is pretty good on the three main PC OSes, but <a href="http://bepo.fr/wiki/Android">limited on Android</a>. For physical keyboards, there's a paid app that supports the bépo layout (as part of a whole package of other keymaps) but it requires you to use it as your input method, whereas since Android 4.1 it's possible to have custom keyboard layouts exported by apps and managed by the system.</p> <p>It's this facility that is used by <a href="https://github.com/anisse/bepo-android/">bepo-android</a> (or <a href="https://play.google.com/store/apps/details?id=fr.bepo.clavierexterne">Bépo clavier externe</a>): a <a href="http://developer.android.com/reference/android/hardware/input/InputManager.html#ACTION_QUERY_KEYBOARD_LAYOUTS">simple intent property</a> you declare in your manifest to tell the system that you're exporting keyboard layouts, with which you point to an XML file listing all your keyboard layouts, each pointing to a single .kcm file. bepo-android is currently exporting a <a href="https://github.com/anisse/bepo-android/blob/master/clavierexterne/src/main/res/raw/bepo.kcm">single layout file</a>, generated for bépo.</p> <p>The bépo project has this wonderful tool called the <a href="http://git.tuxfamily.org/dvorak/pilotes.git/tree/configGenerator">configGenerator</a> that allows regenerating the whole set of keymap files, images of the layout, and configuration files for all platforms in a single command, which runs a few shell, Perl and Python scripts. 
This allowed the project to move fast when making modifications, but is today used mostly by enthusiasts creating their own variants (I used to have such a variant, but now I just use the official one). I created <a href="https://github.com/anisse/bepo-android/blob/master/android.py">such a script for the Android platform</a>. The script generates the necessary .kcm file, and could be re-run with your own customized bépo layout as source. Another possible use would be to rely on its ability to read xkb files to export Xorg keymaps to Android. This would probably need adaptation to add support for more exotic languages and characters.</p> <p>Android has a two-tier keyboard management system. First, there is the <a href="https://source.android.com/devices/tech/input/key-layout-files.html">key layout</a>, which maps evdev keycodes to key names. The key names currently match the ones in Linux's input.h. There's a default key layout you can use as a base, <a href="https://android.googlesource.com/platform/frameworks/base.git/+/android-5.0.2_r1/data/keyboards/Generic.kl">Generic.kl</a>. Then there's the <a href="https://source.android.com/devices/input/key-character-map-files.html">key character map</a>, which tells the system which Unicode character to input when you press a given key. 
This is at least two orders of magnitude simpler than Xorg's system, although xkb is also much more powerful.</p> <p>As you might have guessed, the file generated for this project is a key character map that also uses the undocumented "<a href="https://android.googlesource.com/platform/frameworks/native/+/android-5.0.2_r1/libs/input/KeyCharacterMap.cpp#694">map</a> <a href="https://android.googlesource.com/platform/frameworks/native/+/android-5.0.2_r1/libs/input/KeyCharacterMap.cpp#795">key</a>" directive to remap a part of the key layout and make sure it's not changing under us; for instance, Archos has a different key layout for its keyboards than the default.</p> <p>One of the tradeoffs I had to make was to keep QWERTY as the underlying key layout, which then generates bépo characters. This decision was due to the fact that we cannot attribute key names to all keys: ÉÀÈÊÇ, etc. don't have key labels (found in <a href="https://android.googlesource.com/platform/frameworks/base/+/android-4.1.2_r2.1/include/androidfw/KeycodeLabels.h">KeycodeLabels.h</a> or <a href="https://android.googlesource.com/platform/frameworks/native/+/android-5.0.2_r1/include/input/InputEventLabels.h">InputEventLabels.h</a> depending on your AOSP version), so you can't simply remap all keys in order to have a bépo key layout; you'd have holes in it. I therefore had to resort to using QWERTY as the base, as it's the one in <a href="https://android.googlesource.com/platform/frameworks/base.git/+/android-5.0.2_r1/data/keyboards/Generic.kl">Generic.kl</a>. I'm also secretly hoping it might help with badly-programmed games that have keyboard support, assume QWERTY, and don't allow key remapping. If those exist on Android. 
(but they are legion on the web, which is very annoying).</p> <p>The bépo key character map also maps the <a href="https://source.android.com/devices/input/key-character-map-files.html#behaviors">documented special diacritical dead keys</a>; this is quite useful, but not as complete as xkb's many dead keys, and not nearly as powerful as Xorg's Compose, so not all bépo dead and compose keys are supported.</p> <p>This currently only works with devices running Android 4.1+, provided the manufacturer didn't botch the external keyboard support, as Asus did on my Fonepad 7 (K019), and as a user reported for a Samsung Galaxy Note 10.1. OEMs do that to allow synchronisation between the virtual and physical keyboard layout, but this is just wrong if it removes the user's ability to choose their own keymap.</p> <p>I have yet to hear about other non-working devices; after a month or so, the app hasn't seen much traction (Google Play says there are fewer than 50 installs), so maybe in the niche that is bépo, there isn't much interest in typing stuff on Android. We could even wonder if people would ever do productive work on this platform. But that's a debate for another day.</p> <p>At least I scratched my itch =)</p> <p>Get <a href="https://play.google.com/store/apps/details?id=fr.bepo.clavierexterne">Bépo clavier externe on Google Play</a>.</p>Testing a NAS hard drive over FTP2013-12-20T00:00:00+01:002013-12-20T00:00:00+01:00Anisse Astiertag:anisse.astier.eu,2013-12-20:testing-a-nas-hard-drive-over-ftp.html<p>So I have this NAS made by my ISP that does a lot of things; but recently, I started having issues with its behavior. Recorded TV shows had lag/jitter during playback, and the same happened with other types of videos I put on it. I narrowed it down to the hard drive, which was sometimes providing read speeds of less than 300 KiB/s. I cannot open it to test the hard drive more thoroughly, using mhdd or ATA SMART tests. 
I'll have to innovate a little.</p> <p>In this post, in the form of an ipython3 notebook (<a href="/static/FTP%20fun.ipynb">source</a>), I'm going to test the hard drive over FTP, by doing a full hard drive fill, and then a read. I'm going to:</p> <ul> <li>measure the read and write speed to see if the problem is still present after I formatted it.</li> <li>make sure what I wrote is the same as what I read</li> <li>make sure that I can generate data fast enough</li> <li>try to make the data look "random" so that I don't stumble upon some compression in the FTP -&gt; fs -&gt; hard drive chain.</li> </ul> <p>If everything is well I'll just get on with it: formatting the hard drive fixed the issue. Otherwise, it might be a hardware problem, and I'll have to exchange it.</p> <p>To generate the data, I'll use an md5 hash for its nice output, which looks fairly random and is very hard to compress. I chose md5 because it's fast. I'll use a sequential index as the input so that it's deterministic and I can fairly easily re-generate the input data for comparison.</p> <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">hashlib</span> <span class="n">h</span> <span class="o">=</span> <span class="n">hashlib</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">&quot;md5&quot;</span><span class="p">)</span> <span class="c1">#Generate a deterministic hash</span> <span class="k">def</span> <span class="nf">data</span><span class="p">(</span><span class="n">i</span><span class="p">):</span> <span class="n">h</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="nb">bytes</span><span class="p">(</span><span class="n">i</span><span class="p">))</span> <span class="k">return</span> <span class="n">h</span><span class="o">.</span><span class="n">digest</span><span class="p">()</span> <span class="kn">import</span> <span 
class="nn">time</span> <span class="k">def</span> <span class="nf">testdata</span><span class="p">():</span> <span class="n">n</span> <span class="o">=</span> <span class="mi">10000</span> <span class="n">size</span> <span class="o">=</span> <span class="n">h</span><span class="o">.</span><span class="n">digest_size</span> <span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">):</span> <span class="n">data</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="n">end</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span> <span class="n">speed</span> <span class="o">=</span> <span class="n">n</span><span class="o">*</span><span class="n">size</span><span class="o">/</span><span class="p">(</span><span class="n">end</span><span class="o">-</span><span class="n">start</span><span class="p">)</span> <span class="k">print</span><span class="p">(</span><span class="s2">&quot;We generated </span><span class="si">%d</span><span class="s2"> bytes in </span><span class="si">%f</span><span class="s2"> s </span><span class="si">%d</span><span class="s2"> B/s&quot;</span><span class="o">%</span><span class="p">(</span><span class="n">n</span><span class="o">*</span><span class="n">size</span><span class="p">,</span> <span class="n">end</span><span class="o">-</span><span class="n">start</span><span class="p">,</span> <span class="n">speed</span><span class="p">))</span> <span class="n">testdata</span><span class="p">()</span> </pre></div> <pre> We generated 160000 bytes in 1.610000 s 99378 B/s </pre> <p>Ouch. 
I use a slow machine, and it's far from the at least 60MiB/s I need to thoroughly test the hard drive. Let's see if I can find a faster hash.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">testallhashes</span><span class="p">():</span> <span class="k">global</span> <span class="n">h</span> <span class="k">for</span> <span class="nb">hash</span> <span class="ow">in</span> <span class="n">hashlib</span><span class="o">.</span><span class="n">algorithms_available</span><span class="p">:</span> <span class="n">h</span> <span class="o">=</span> <span class="n">hashlib</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="nb">hash</span><span class="p">)</span> <span class="k">print</span><span class="p">(</span><span class="nb">hash</span><span class="p">,</span> <span class="n">end</span><span class="o">=</span><span class="s1">&#39; &#39;</span><span class="p">)</span> <span class="n">testdata</span><span class="p">()</span> <span class="n">testallhashes</span><span class="p">()</span> </pre></div> <pre> SHA1 We generated 200000 bytes in 2.570000 s 77821 B/s SHA512 We generated 640000 bytes in 6.780000 s 94395 B/s RIPEMD160 We generated 200000 bytes in 2.870000 s 69686 B/s SHA224 We generated 280000 bytes in 3.480000 s 80459 B/s sha512 We generated 640000 bytes in 6.770000 s 94534 B/s md5 We generated 160000 bytes in 1.620000 s 98765 B/s md4 We generated 160000 bytes in 1.350000 s 118518 B/s SHA256 We generated 320000 bytes in 3.500000 s 91428 B/s ripemd160 We generated 200000 bytes in 2.870000 s 69686 B/s whirlpool We generated 640000 bytes in 19.590000 s 32669 B/s dsaEncryption We generated 200000 bytes in 2.580000 s 77519 B/s sha384 We generated 480000 bytes in 6.800000 s 70588 B/s sha1 We generated 200000 bytes in 2.570000 s 77821 B/s dsaWithSHA We generated 200000 bytes in 2.580000 s 77519 B/s SHA We generated 200000 bytes in 2.580000 s 77519 B/s sha224 We generated 280000 bytes in 
3.490000 s 80229 B/s DSA-SHA We generated 200000 bytes in 2.570000 s 77821 B/s MD5 We generated 160000 bytes in 1.600000 s 99999 B/s sha We generated 200000 bytes in 2.570000 s 77821 B/s MD4 We generated 160000 bytes in 1.350000 s 118518 B/s ecdsa-with-SHA1 We generated 200000 bytes in 2.570000 s 77821 B/s sha256 We generated 320000 bytes in 3.490000 s 91690 B/s SHA384 We generated 480000 bytes in 6.780000 s 70796 B/s DSA We generated 200000 bytes in 2.590000 s 77220 B/s </pre> <p>Well, no luck. I'll just use a big buffer and have it loop around.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">bigbuffer</span><span class="p">():</span> <span class="k">global</span> <span class="n">h</span> <span class="n">h</span> <span class="o">=</span> <span class="n">hashlib</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">&quot;md5&quot;</span><span class="p">)</span> <span class="n">buf</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">()</span> <span class="n">count</span> <span class="o">=</span> <span class="mi">2</span><span class="o">**</span><span class="mi">18</span> <span class="o">//</span> <span class="n">h</span><span class="o">.</span><span class="n">digest_size</span> <span class="c1"># we want a 256KiB buffer</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">count</span><span class="p">):</span> <span class="n">buf</span> <span class="o">+=</span> <span class="n">data</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">return</span> <span class="n">buf</span> <span class="k">assert</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">bigbuffer</span><span class="p">())</span> <span class="o">==</span> <span class="mi">262144</span><span class="p">)</span> 
<span class="c1"># verify the length</span> </pre></div> <p>That's for the basics.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">CustomBuffer</span><span class="p">:</span> <span class="sd">&quot;&quot;&quot;</span> <span class="sd"> A wrap-around file-like object that returns in-memory data from buf</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">limit</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">buf</span> <span class="o">=</span> <span class="n">bigbuffer</span><span class="p">()</span> <span class="bp">self</span><span class="o">.</span><span class="n">bufindex</span> <span class="o">=</span> <span class="mi">0</span> <span class="bp">self</span><span class="o">.</span><span class="n">fileindex</span> <span class="o">=</span> <span class="mi">0</span> <span class="bp">self</span><span class="o">.</span><span class="n">bufsize</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">buf</span><span class="p">)</span> <span class="bp">self</span><span class="o">.</span><span class="n">limit</span> <span class="o">=</span> <span class="n">limit</span> <span class="k">def</span> <span class="nf">readloop</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="mi">8096</span><span class="p">):</span> <span class="n">dat</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">buf</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">bufindex</span><span class="p">:</span><span 
class="bp">self</span><span class="o">.</span><span class="n">bufindex</span> <span class="o">+</span> <span class="n">i</span><span class="p">]</span> <span class="n">end</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">bufindex</span> <span class="o">+</span> <span class="n">i</span> <span class="k">while</span> <span class="n">end</span> <span class="o">&gt;</span> <span class="bp">self</span><span class="o">.</span><span class="n">bufsize</span><span class="p">:</span> <span class="n">end</span> <span class="o">-=</span> <span class="bp">self</span><span class="o">.</span><span class="n">bufsize</span> <span class="n">dat</span> <span class="o">+=</span> <span class="bp">self</span><span class="o">.</span><span class="n">buf</span><span class="p">[:</span><span class="n">end</span><span class="p">]</span> <span class="bp">self</span><span class="o">.</span><span class="n">bufindex</span> <span class="o">=</span> <span class="n">end</span> <span class="k">return</span> <span class="n">dat</span> <span class="k">def</span> <span class="nf">read</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">i</span><span class="o">=</span><span class="mi">8096</span><span class="p">):</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">limit</span> <span class="o">==</span> <span class="bp">None</span><span class="p">:</span> <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">readloop</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">fileindex</span> <span class="o">&gt;=</span> <span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">:</span> <span class="k">return</span> <span class="nb">bytes</span><span 
class="p">()</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">fileindex</span> <span class="o">+</span> <span class="n">i</span> <span class="o">&gt;</span> <span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">:</span> <span class="n">dat</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">readloop</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">fileindex</span><span class="p">)</span> <span class="bp">self</span><span class="o">.</span><span class="n">fileindex</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">limit</span> <span class="k">return</span> <span class="n">dat</span> <span class="bp">self</span><span class="o">.</span><span class="n">fileindex</span> <span class="o">+=</span> <span class="n">i</span> <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">readloop</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">def</span> <span class="nf">testreadcbuf</span><span class="p">():</span> <span class="n">f</span> <span class="o">=</span> <span class="n">CustomBuffer</span><span class="p">(</span><span class="mi">2548</span><span class="p">)</span> <span class="k">assert</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="mi">2048</span><span class="p">))</span> <span class="o">==</span> <span class="mi">2048</span><span class="p">)</span> <span class="k">assert</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">f</span><span 
class="o">.</span><span class="n">read</span><span class="p">())</span> <span class="o">==</span> <span class="mi">500</span><span class="p">)</span> <span class="n">testreadcbuf</span><span class="p">()</span> <span class="k">def</span> <span class="nf">testcbuf</span><span class="p">(</span><span class="n">limit</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span> <span class="n">f</span> <span class="o">=</span> <span class="n">CustomBuffer</span><span class="p">(</span><span class="n">limit</span><span class="p">)</span> <span class="n">l</span> <span class="o">=</span> <span class="mi">0</span> <span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10000</span><span class="p">):</span> <span class="n">l</span> <span class="o">+=</span> <span class="nb">len</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">())</span> <span class="n">end</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span> <span class="n">speed</span> <span class="o">=</span> <span class="n">l</span><span class="o">/</span><span class="p">(</span><span class="n">end</span><span class="o">-</span><span class="n">start</span><span class="p">)</span> <span class="k">print</span><span class="p">(</span><span class="s2">&quot;We generated </span><span class="si">%d</span><span class="s2"> bytes in </span><span class="si">%f</span><span class="s2"> s </span><span class="si">%d</span><span class="s2"> B/s&quot;</span><span class="o">%</span><span class="p">(</span><span class="n">l</span><span class="p">,</span> <span class="n">end</span><span class="o">-</span><span 
class="n">start</span><span class="p">,</span> <span class="n">speed</span><span class="p">))</span> <span class="n">testcbuf</span><span class="p">()</span> </pre></div> <pre> We generated 80960000 bytes in 0.780000 s 103794871 B/s </pre> <p>That's more in line with what we need.</p> <h1>the FTP stuff</h1> <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">ftplib</span> <span class="kn">from</span> <span class="nn">ftpconfig</span> <span class="kn">import</span> <span class="n">config</span><span class="p">,</span> <span class="n">config_example</span> <span class="c1"># ftp credentials, etc</span> <span class="k">print</span><span class="p">(</span><span class="n">config_example</span><span class="p">)</span> </pre></div> <pre> {'password': 'verylongandcomplicatedpassword', 'host': '192.168.1.254', 'path': '/HD/', 'username': 'boitegratuite'} </pre> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">ftpconnect</span><span class="p">():</span> <span class="n">ftp</span> <span class="o">=</span> <span class="n">ftplib</span><span class="o">.</span><span class="n">FTP</span><span class="p">(</span><span class="n">config</span><span class="p">[</span><span class="s1">&#39;host&#39;</span><span class="p">])</span> <span class="n">ftp</span><span class="o">.</span><span class="n">login</span><span class="p">(</span><span class="n">config</span><span class="p">[</span><span class="s1">&#39;username&#39;</span><span class="p">],</span> <span class="n">config</span><span class="p">[</span><span class="s1">&#39;password&#39;</span><span class="p">])</span> <span class="n">ftp</span><span class="o">.</span><span class="n">cwd</span><span class="p">(</span><span class="n">config</span><span class="p">[</span><span class="s1">&#39;path&#39;</span><span class="p">])</span> <span class="k">return</span> <span class="n">ftp</span> <span class="k">def</span> <span class="nf">transfer_rate</span><span 
class="p">(</span><span class="n">prev_stamp</span><span class="p">,</span> <span class="n">now</span><span class="p">,</span> <span class="n">blocksize</span><span class="p">):</span> <span class="n">diff</span> <span class="o">=</span> <span class="n">now</span> <span class="o">-</span> <span class="n">prev_stamp</span> <span class="n">rate</span> <span class="o">=</span> <span class="n">blocksize</span><span class="o">/</span><span class="p">(</span><span class="n">diff</span><span class="o">*</span><span class="mi">2</span><span class="o">**</span><span class="mi">20</span><span class="p">)</span> <span class="c1"># store in MiB/s directly</span> <span class="k">return</span> <span class="p">[</span><span class="n">now</span><span class="p">,</span> <span class="n">rate</span><span class="p">]</span> <span class="k">def</span> <span class="nf">store</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">2</span><span class="o">**</span><span class="mi">25</span><span class="p">,</span> <span class="n">blocksize</span><span class="o">=</span><span class="mi">2</span><span class="o">**</span><span class="mi">20</span><span class="p">):</span> <span class="n">values</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">def</span> <span class="nf">watch</span><span class="p">(</span><span class="n">block</span><span class="p">):</span> <span class="n">t2</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()</span> <span class="n">values</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">transfer_rate</span><span class="p">(</span><span class="n">t1</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">t2</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">block</span><span 
class="p">)))</span> <span class="n">t1</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">t2</span> <span class="n">ftp</span> <span class="o">=</span> <span class="n">ftpconnect</span><span class="p">()</span> <span class="n">buf</span> <span class="o">=</span> <span class="n">CustomBuffer</span><span class="p">(</span><span class="n">size</span><span class="p">)</span> <span class="n">t1</span> <span class="o">=</span> <span class="p">[</span><span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()]</span> <span class="k">try</span><span class="p">:</span> <span class="n">ftp</span><span class="o">.</span><span class="n">storbinary</span><span class="p">(</span><span class="s2">&quot;STOR filler&quot;</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">blocksize</span><span class="o">=</span><span class="n">blocksize</span><span class="p">,</span> <span class="n">callback</span><span class="o">=</span><span class="n">watch</span><span class="p">)</span> <span class="n">ftp</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> <span class="k">except</span> <span class="n">ConnectionResetError</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s2">&quot;Connection severed by peer&quot;</span><span class="p">)</span> <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s2">&quot;Transfer interrupted:&quot;</span><span class="p">,</span> <span class="n">e</span><span class="p">)</span> <span class="k">return</span> <span class="n">values</span> <span class="n">values</span> <span class="o">=</span> <span class="n">store</span><span class="p">(</span><span class="mi">2</span><span 
class="o">**</span><span class="mi">27</span><span class="p">)</span> </pre></div> <p>Now let's plot those values:</p> <div class="highlight"><pre><span></span><span class="o">%</span><span class="n">pylab</span> <span class="n">inline</span> <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span> <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> <span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">values</span><span class="p">)</span><span class="o">.</span><span class="n">transpose</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">a</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">a</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;rate (MiB/s)&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;time (s)&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </pre></div> <pre> Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline]. For more information, type 'help(pylab)'.
</pre> <p><img alt="png" src="/images/FTP%20fun_14_1.png" /></p> <p>Ok, we have what we wanted: we can measure the write speeds.</p> <p>Now let's check read speeds.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">reread</span><span class="p">(</span><span class="n">blocksize</span><span class="o">=</span><span class="mi">2</span><span class="o">**</span><span class="mi">20</span><span class="p">):</span> <span class="n">values</span><span class="o">=</span><span class="p">[]</span> <span class="n">verif</span> <span class="o">=</span> <span class="n">CustomBuffer</span><span class="p">()</span> <span class="n">i</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">def</span> <span class="nf">watch</span><span class="p">(</span><span class="n">block</span><span class="p">):</span> <span class="n">t2</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()</span> <span class="n">values</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">transfer_rate</span><span class="p">(</span><span class="n">t1</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">t2</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">block</span><span class="p">)))</span> <span class="n">dat</span> <span class="o">=</span> <span class="n">verif</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">block</span><span class="p">))</span> <span class="k">if</span> <span class="n">dat</span> <span class="o">!=</span> <span class="n">block</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s2">&quot;ERROR !!!! 
Data read isn&#39;t correct at block&quot;</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span> <span class="n">t1</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">t2</span> <span class="n">i</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+=</span> <span class="mi">1</span> <span class="n">ftp</span> <span class="o">=</span> <span class="n">ftpconnect</span><span class="p">()</span> <span class="n">t1</span> <span class="o">=</span> <span class="p">[</span><span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()]</span> <span class="k">try</span><span class="p">:</span> <span class="n">ftp</span><span class="o">.</span><span class="n">retrbinary</span><span class="p">(</span><span class="s2">&quot;RETR filler&quot;</span><span class="p">,</span> <span class="n">blocksize</span><span class="o">=</span><span class="n">blocksize</span><span class="p">,</span> <span class="n">callback</span><span class="o">=</span><span class="n">watch</span><span class="p">)</span> <span class="n">ftp</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s2">&quot;Transfer interrupted:&quot;</span><span class="p">,</span> <span class="n">e</span><span class="p">)</span> <span class="k">return</span> <span class="n">values</span> <span class="k">def</span> <span class="nf">plot_transfer_speed</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">title</span><span class="p">):</span> <span class="k">def</span> <span class="nf">average</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span> <span 
class="n">n</span><span class="p">):</span> <span class="n">end</span> <span class="o">=</span> <span class="n">n</span> <span class="o">*</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">arr</span><span class="p">)</span><span class="o">//</span><span class="n">n</span><span class="p">)</span> <span class="k">return</span> <span class="n">numpy</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">arr</span><span class="p">[:</span><span class="n">end</span><span class="p">]</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">n</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span> <span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="o">.</span><span class="n">transpose</span><span class="p">()</span> <span class="n">a0</span> <span class="o">=</span> <span class="n">average</span><span class="p">(</span><span class="n">a</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="nb">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">a</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">//</span><span class="mi">300</span><span class="p">))</span> <span class="n">a1</span> <span class="o">=</span> <span class="n">average</span><span class="p">(</span><span class="n">a</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="nb">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span 
class="p">(</span><span class="n">a</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">//</span><span class="mi">300</span><span class="p">))</span> <span class="n">lines</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">a0</span><span class="p">,</span> <span class="n">a1</span><span class="p">)</span> <span class="c1">#plt.setp(lines, aa=True)</span> <span class="n">plt</span><span class="o">.</span><span class="n">gcf</span><span class="p">()</span><span class="o">.</span><span class="n">set_size_inches</span><span class="p">(</span><span class="mi">22</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">&quot;MiB/s&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">&quot;seconds&quot;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="n">title</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> <span class="n">rval</span> <span class="o">=</span> <span class="n">reread</span><span class="p">()</span> <span class="n">plot_transfer_speed</span><span class="p">(</span><span class="n">rval</span><span class="p">,</span> <span class="s2">&quot;Read speed&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="/images/FTP%20fun_17_0.png" /></p> <p>We have all the pieces now. 
Let's do the filling and plot the data.</p> <div class="highlight"><pre><span></span><span class="c1"># About 230 GiB is the effective size of the disk (almost 250 GB)</span> <span class="n">val</span> <span class="o">=</span> <span class="n">store</span><span class="p">(</span><span class="mi">230</span><span class="o">*</span><span class="mi">2</span><span class="o">**</span><span class="mi">30</span><span class="p">,</span> <span class="mi">10</span><span class="o">*</span><span class="mi">2</span><span class="o">**</span><span class="mi">20</span><span class="p">)</span> </pre></div> <pre> Connection severed by peer </pre> <div class="highlight"><pre><span></span><span class="n">plot_transfer_speed</span><span class="p">(</span><span class="n">val</span><span class="p">,</span> <span class="s2">&quot;Write speed&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="/images/FTP%20fun_20_0.png" /></p> <div class="highlight"><pre><span></span><span class="n">rval</span> <span class="o">=</span> <span class="n">reread</span><span class="p">(</span><span class="mi">10</span><span class="o">*</span><span class="mi">2</span><span class="o">**</span><span class="mi">20</span><span class="p">)</span> <span class="n">plot_transfer_speed</span><span class="p">(</span><span class="n">rval</span><span class="p">,</span> <span class="s2">&quot;Read speed&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="/images/FTP%20fun_22_0.png" /></p> <h1>That's it!</h1> <p>We could also plot the read speed against the disk byte index instead of time, which might be more interesting. This is left as an exercise for the reader.</p> <p>Regarding the data, it's far from what we could get with ATA SMART (using smartctl or skdump/sktest), but it's interesting nonetheless.
We can see that the read speed sometimes drops, which may indicate a localized hard drive problem.</p> <p>What does not appear here is that I've run the tests multiple times to make sure the data is correct. And both the read and the write are long, multi-hour tests.</p> <p>Also, the simple fact of running a write test may fix an existing problem, by making the disk's firmware aware of the existence of bad blocks. This is amplified by the fact that I ran the tests multiple times.</p> <p>Finally, what could be improved is the way a high number of data points is displayed. Here I've averaged groups of points, which might hide how low the read/write speed can go locally. Maybe displaying the data using a vector format would be better (svg, python-chaco?).</p> <p>Regarding the decision to dispose of the hard drive or the NAS, I think I'll keep it for now until it dies, but I'll start putting my data on an external HDD (plugged into the ISP box), and only trust the internal hard drive with low-priority stuff like the occasional video recording.</p>Embedded Linux Conference Europe 2013 notes2013-10-30T00:00:00+01:002013-10-30T00:00:00+01:00Anisse Astiertag:anisse.astier.eu,2013-10-30:embedded-linux-conference-europe-2013-notes.html<p>So I was in Edinburgh this year, and I took notes as I usually do. These are intended for personal consumption (do not expect LWN-style reports), but as more people were asking me to share them, I thought why not do it in public?</p> <h1>Embedded Linux timeline</h1> <p>by Chris Simmonds</p> <p>BusyBox was started by Bruce Perens to solve the floppy installation problem.
The first Linux Router was described in "Arlan Wireless Howto".</p> <p>Linux gained portability to other architectures over time:</p> <ul> <li>1995: MIPS</li> <li>1996: m68k, ppc</li> <li>1998: m68k DragonBall Palm Pilot: creation of uClinux (no MMU)</li> <li>1999: ARM</li> </ul> <p>Flash memory support was added by David Woodhouse in 1999 (MTD layer), then JFFS by AXIS for their IP cameras.</p> <h2>Devices</h2> <p>Things really started in 1999: AXIS IP camera, TiVo DVR, Kerbango Internet Radio (3Com). There was a lot of media coverage at that time.</p> <p>Companies sprang up to service embedded Linux: Timesys, MontaVista, Lineo, Denx.</p> <p>The handhelds.org project aimed at porting Linux to the Compaq iPaq H3600. Cross-compiling being a pain in the arse, they had a cluster of ~16 iPaqs to compile code.</p> <p>In 2001, there was the infamous Unobtainium: a handset prototype at Compaq based on iPaq hardware with GSM/CDMA/Wifi/Bluetooth, camera, accelerometer, 1GiB of storage: it was really the first smartphone prototype. Never shipped.</p> <p>At the same time, Sharp made the Zaurus running Linux 2.4.10 (software made by Lineo).</p> <p>In 2003, Motorola made the A760 handset, the first Linux handset (MontaVista). In 2005 came the Nokia 770, the first Internet Tablet running Maemo Linux.</p> <h2>Buildtools and software</h2> <p>In 2001 buildroot was created from the needs of the uClinux project: it's still the oldest and simplest build system. Then OpenEmbedded in 2003. Then in 2004 Poky Linux based on OE by OpenedHand, then Yocto. In my opinion Chris has a narrow view of the build system choices (what about Debian?).</p> <p>Real-time was at first achieved with sub-kernels, like Xenomai. Then native real-time: Linux/RT 1.0 by Timesys in 2000. Then the voluntary preempt patch (Ingo Molnar &amp; Andrew Morton). Robert Love's kernel preemption patch came in 2001. In 2003 Linux 2.6 includes voluntary preempt.
In 2005 PREEMPT_RT was started; in 2013, not all of it is merged yet.</p> <p>In the end: Linux is the "default" embedded OS.</p> <h1>How not to write x86 platform drivers</h1> <p>by Darren Hart</p> <p>This talk was mostly feedback on getting the MinnowBoard mainlined properly.</p> <p>At Intel a platform is CPU + chipset, or a SoC. In Linux, it represents things that are not on a real bus, or things that cannot be enumerated, leading to board file drivers.</p> <p>The MinnowBoard uses a 32-bit UEFI firmware. It is one of the first designs to make use of all Queensbay (Intel SoC) GPIOs. The UART clock is special (50 MHz). It has a low-cost Ethernet PHY with no EEPROM for MAC addresses. The MinnowBoard is a dynamic baseboard, which is very different from what Intel usually does: it supports daughter cards.</p> <p>There are three main sources of GPIOs on this board (5 core, 8 suspend, 12 pch), 4 user buttons, 2 user LEDs, phy reset, then expansion GPIOs.</p> <h2>Board files</h2> <p>MinnowBoard used board files at first because they are simple to use.</p> <p>Those were rejected. Why ?</p> <ul> <li>not automatically enumerated and loaded</li> <li>adds maintenance</li> <li>independent drivers had to be board aware</li> </ul> <p>All this leads to "evil" vendor trees.</p> <p>UART clock is firmware dependent. Previous code used DMI detection, which isn't nice.</p> <p>Ethernet is complicated: aggressive power saving means you must wake it up on open. How to identify the PHY ? You could use SMBIOS/DMI, DT, ACPI; in the end PCI subsystem IDs were used. Initialized with platform_data.</p> <p>The MAC: no EEPROM, so had to solve how to get a MAC. Was done in firmware in the end: read the SPI flash, then write PCI registers.</p> <p>To preserve the platform one should not create vendor trees. The complexity of core kernel vs drivers is inverted: core kernel has simple primitives, but complex algorithms. 
Drivers are the opposite: each is simple to understand, but how they fit together is hard to organize.</p> <h2>GPIO, take 2: ACPI 5.0</h2> <p>A lot of things can be done with the new ACPI standard, like identifying GPIO resources. You can't do keybindings, default trigger, etc. Some vendors (like Apple) do already, but with their own proprietary additions.</p> <p>One needs to write ASL for the DSDT. You might want to have dictionaries to describe your hardware, which needs to be standardized. Right now ACPI reserved methods are used (_PRP).</p> <h1>Device trees for dummies</h1> <p>by Thomas Petazzoni (<a href="http://free-electrons.com/pub/conferences/2013/elce/petazzoni-device-tree-dummies/petazzoni-device-tree-dummies.pdf">Slides</a>)</p> <p>Before DT, all the information was inside the kernel. Little information in ATAGS. Now all information is in DT.</p> <p>Device Tree is a tree data structure to describe hardware that cannot be enumerated. You compile a DTS (source) into a DTB (binary). On ARM, all DTS files are in arch/arm/boot/dts and automatically compiled for your board. Device Tree bindings are the "standard" for how you should describe the hardware using the DT language, and what the driver understands. All bindings should be documented and reviewed. DT should not describe configuration, just hardware layout. The problem isn't solved yet for configuration.</p> <p>The talk had a lot of nice syntax examples to learn how to write DTS. For example, Thomas explained the importance of compatible strings, which are used to match a DTS device node with a driver.</p> <p>Should DT be an ABI ? Hard question. While it was the original idea, maybe it shouldn't. Current discussions seem to want to relax the stable ABI rule.</p> <h1>Use case power management</h1> <p>by Patrick Titiano (<a href="http://events.linuxfoundation.org/sites/events/files/slides/Use-Case%20Power%20Management%20Optimization%20ELC-E%20Presentation.pdf">Slides</a>)</p> <p>First rule about PM: shut down anything not used. 
You need to track running power resources: it starts with the clock tree.</p> <p>Things to monitor:</p> <ul> <li>C-states/idle states stats.</li> <li>Operating point statistics</li> <li>CPU &amp; HW load</li> <li>memory bandwidth : most often a bottleneck</li> </ul> <p>You need to instrument both software and hardware. It means you need resistor points on the PCB and temperature sensors. You need to automate everything, otherwise you're not comparing apples to apples. All measurements should be automated to be easily reproduced.</p> <p>You need to have a power model of your raw SoC consumption and characteristics, then you need to assess this model to verify that the target is realistic.</p> <p>Voltage is more important than frequency. It's easier to reduce consumption than to find a better way to dissipate energy.</p> <p>Battery is king. You need a full system view, because you should optimize the biggest offenders first. Take care of inter-dependent stuff.</p> <h1>Android debugging tools</h1> <p>by Karim Yaghmour (<a href="http://opersys.com/downloads/android-platform-debug-dev-clean-131030.pdf">Slides</a>)</p> <p>Android usually runs on an SoC. It uses bionic, a different libc, and it has a Hardware Abstraction Layer that allows proprietary drivers in userspace. Toolbox is a lesser BusyBox clone under a BSD license.</p> <p>Binder is a standard component, an object IPC mechanism that almost defines the Android system. Every system service uses it.</p> <p>To debug, it's handy to load AOSP in Eclipse. You can load all the OS classes and apps in the editor to be able to trace anything and browse AOSP while seeing classes, call chains, etc. It's more powerful for browsing and live debugging than common editors. 
You still have to build AOSP by hand (type <code>make</code>/<code>lunch</code>) to generate an image.</p> <p>A few tools:</p> <ul> <li><code>latencytop</code>, <code>schedtop</code>, etc.</li> <li><code>dumpsys</code></li> <li><code>service</code></li> <li><code>logcat</code></li> <li><code>dumpstate</code> (root app), <code>bugreport</code> : dumps system state (in /proc, etc.)</li> <li><code>watchprop</code></li> </ul> <p>Logging goes through the logger driver.</p> <p>Interfaces with the system:</p> <ul> <li><code>start</code>/<code>stop</code> : starts/stops <code>zygote</code>, which means you can shut down/start the interface and all the Java stuff</li> <li><code>service call statusbar</code> {1,2,5 s16 alarm_clock i32 0} : you can directly call methods that are defined in an aidl file, by using their implicit sequence number. It's useful to bypass the Java framework and call the services directly. See android/os/IPowerManager.aidl for example, or android/internal/statusbar/IStatusBar.aidl for the previous example</li> <li><code>am</code> : tool to call intents, e.g. <code>am start -a android.intent.action.VIEW -d http://webpage.com</code>. 
It is very powerful.</li> <li><code>pm</code> : calls to the package manager</li> <li><code>wm</code> : calls to the window manager</li> </ul> <p>When working with AOSP sources, source <code>build/envsetup.sh</code>; it has very handy functions, like:</p> <ul> <li><code>godir</code>: jumps to the directory a file is in</li> <li><code>mm</code> : builds the modules in the current directory</li> <li><code>jgrep</code>/<code>cgrep</code>/<code>resgrep</code> : grep for specific files (java/c/resource)</li> <li><code>croot</code> : jump up to the AOSP root</li> </ul> <p>Take care when working with AOSP, it's BIG (about 8GB)</p> <p>When debugging, you have to use a different debugger depending on the use case:</p> <ul> <li><code>ddms</code> for dalvik level stuff</li> <li><code>gdb</code>/<code>gdbserver</code> for HAL stuff</li> <li>JTAG for kernel</li> </ul> <p>DDMS talks JDWP (Java Debug Wire Protocol). Use the one from AOSP, not Eclipse. It's very powerful to debug (Java) system processes live.</p> <p><code>gdbserver</code>: you have to configure your app's Android.mk to have -ggdb, and disable stripping. You also have to do port forwarding with adb in order to access gdbserver:</p> <div class="highlight"><pre><span></span>adb forward tcp:2345 tcp:2345 </pre></div> <p>You can use the prebuilt arm-eabi-gdb, but multi-threading might not be supported.</p> <h3>logging</h3> <ul> <li>logcat works</li> <li>ftrace is supported through systrace</li> <li>atrace (device dependent ?)</li> <li>perf is not well supported on ARM</li> </ul> <h1>Embedded build systems showdown</h1> <p>A nice panel, where we had representatives of different build systems:</p> <ul> <li>DIY : Tim Bird, Sony</li> <li>Android : Karim Yaghmour, Opersys</li> <li>Buildroot : Thomas Petazzoni, Free Electrons</li> <li>Yocto : Jeff Osier-Mixon, Intel</li> </ul> <p>It was very friendly. The takeaway is that each system addresses different use cases. Yocto is big-company friendly, because it has metadata and licence management built-in. 
Tim Bird said he had a personal preference for Buildroot as a developer, although a division of his company recently switched to Yocto for its projects.</p> <p>Karim's opinion was that although Android wasn't community friendly and couldn't integrate with anything external, it was king in terms of market traction, and that it might be the most used system in any kind of embedded device in 4 to 5 years.</p> <h1>Best practices for long-term support and security of the device-tree</h1> <p>by Alison Chaiken (<a href="http://events.linuxfoundation.org/sites/events/files/slides/DT_ELCE_2013.pdf">Slides</a> <a href="http://she-devel.com/DT_ELCE_2013.pdf">at Author's</a>)</p> <p>DT makes life a bit easier, although there are pitfalls. Best practices could help with that matter.</p> <p>Updates are hard without DT. How about with DT ? Should you update the DTB ?</p> <p>DTs are supposed to be for HW description, but there are already many configuration items in DT: MTD partition tables, boot device selection, pinmux. Alison gave an example from automotive: battery technology is evolving, which could allow updating an electric car's battery. Cars have a lot of processors; e.g. the 2014 Mercedes S-Class will have 200 MCUs on Ethernet, CAN-FD and LIN.</p> <p>One thing to be careful about is Kconfig and DTS matching.</p> <p>One pitfall you might have is unintentionally breaking DT or a device by changing something in a driver or another device. For example, a post by Koen Kooi said you might blow an HDMI transceiver on some board if you boot with micro SD, because micro SD uses a higher voltage by default.</p> <p>You can use .its to bundle DTS, kernel and other blobs in one .itb file. 
Support for signing .itb files was added to U-Boot by ChromeOS engineers.</p> <p>One option floating around, presented by Pantelis, is to use DTS runtime overlays as an update method, similar to unionfs.</p> <p>A DTS schema validator looks like a good thing to have, like Stephen Warren's very recent proposal.</p> <h1>Android on a non-mobile embedded system</h1> <p>by Arnout Vandecappelle</p> <p>The main motivation is the reduced time to market, and the wealth of available app developers.</p> <p>It's interesting because it's still Linux, but there are a few differences (bionic libc, special build system).</p> <p>My own impressions: lots of generic stuff, from someone who just recently went into Android. Like most of this stuff, it doesn't come from Google, so it has little "new" information in it. It was a nice talk if you've never heard of AOSP and have only been using other embedded distros/build systems.</p> <h1>Buildroot : What's new ?</h1> <p>by Peter Korsgaard</p> <h2>Buildroot</h2> <p>Buildroot is an Embedded Linux build system. It's one of the oldest out there, and is fairly well documented. It has an active community. It's relatively simple, and keeping it that way (Keep It Simple, Stupid) is a focus of the project. For example, there's no binary package management.</p> <p>It's Kconfig-based for configuration, and uses make for building.</p> <p>Buildroot is package-based, and a build step just runs every package build. It's therefore a meta build system. A package is composed of :</p> <ul> <li>a config (in kconfig format) for dependencies, description, etc. You need to include this config under the parent config option.</li> <li>a makefile (Package.mk) with the build steps</li> </ul> <p>Buildroot is using git for its source code, patches are posted on the ML and managed in Patchwork.</p> <p>Buildroot activity has been growing over the years: more emails on the ML (~1000/month), more contributors (30-40 each month). 
Developer days are held 2 times a year (this year at FOSDEM and ELCE).</p> <p>Buildroot is used in many products (Barco, Google Fiber), and SDKs (Atmel, Synopsys, Cadence, Imagination…)</p> <h2>What's new ?</h2> <p>It supports more architectures (ARC, Blackfin, Microblaze, Xtensa…), and the variant support has been improved (ARM softfp/hardfp/neon…), as well as the nommu support.</p> <p>Buildroot now supports more toolchains: C library (glibc, eglibc, uClibc), and external toolchains.</p> <p>Buildroot has 30% more packages than last year. A lot of stuff has been added (gstreamer, EFL, wayland, systemd, perf, python3, nodejs…). A GSoC student worked on adding ARM proprietary GPU and video drivers support.</p> <p>QA has been improved as well, with continuous integration/regression testing. The development cycle is now 3 months, with 1 month of stabilization.</p> <p>License compliance has been added: every package should have a license, and "make legal-info" generates all the necessary stuff.</p> <p>There's a new Eclipse CDT plugin. Popular boards got their own defconfigs to ease starting.</p> <p>A lot of configuration options were added to the menuconfig. New options to add a rootfs overlay, or last-second hook scripts.</p> <p>Upcoming work includes external packages overlays, SELinux support, updated systemd/udev, and whatever else gets submitted.</p>Hello, World!2013-10-20T00:00:00+02:002013-10-20T00:00:00+02:00Anisse Astiertag:anisse.astier.eu,2013-10-20:hello-world.html<p>So I finally did it. You're reading it right now. My personal website/blog.</p> <p>I should be posting here about things that cross my mind as well as various projects I've been working on. And maybe even new projects I haven't even started yet.</p> <h3>The design</h3> <p>First of all, kudos to Pascal Navière, a very talented web designer who did the design of this site (CSS, DOM structure, etc.), which I then modified. 
All bugs are therefore my own additions.</p> <h3>The tech</h3> <p>The DNS you used to access this website is hosted by <a href="https://gandi.net">gandi</a>. The website itself resides at <a href="http://kimsufi.com">OVH</a>, who used to sell the world's cheapest VPS (they're currently out of stock for all their products, but I won't go into that). The SSL certificate is provided by <a href="http://www.startssl.com/">StartSSL</a>.</p> <p>This VPS runs <a href="http://debian.org">Debian</a> Wheezy, with <a href="http://nginx.org">nginx</a> serving the actual pages. Pages which are all old school static <a href="http://www.w3.org/TR/html-markup/">HTML</a>(5), generated by <a href="http://docs.getpelican.com">Pelican</a>.</p> <p>On my machine pelican is run with <a href="http://python.org">python</a> 3.3, in a <a href="http://docs.python.org/3/library/venv.html#creating-virtual-environments">venv</a> where <a href="http://python-distribute.org/distribute_setup.py">distribute was installed</a>. The content is edited with <a href="https://vim.org">vim</a> on <a href="http://fedoraproject.org/">Fedora</a> 19.</p>