OCaml Weekly News
Hello
Here is the latest OCaml Weekly News, for the week of July 08 to 15, 2025.
Table of Contents
- OCaml Language Committee: an update on a policy for conflicts of interest
- OCaml intern for Claudius
- An example for every OCaml package
- Esa 0.1.0 - Enhanced Suffix Array (and further plans)
- Tutorial: cut and pasting code
- QCheck 0.24
- New Odoc-3-Generated Package Documentation is Live on OCaml.org
- Lwt.6.0.0~alpha (direct-style)
- MirageOS on Unikraft
- Other OCaml News
- Old CWN
OCaml Language Committee: an update on a policy for conflicts of interest
octachron announced
When discussing the proposition for include functors, the language committee fell into the rabbit hole of discussing conflicts of interest.
After some discussions, as the current committee chair, I have decided to propose to amend the committee description with our current understanding of a transparency-based policy for conflicts of interest at https://github.com/ocaml/RFCs/pull/55 .
The core idea behind that policy is that in a small-world community like ours, trying to completely avoid conflicts would be counter-productive. At the same time, making sure that everyone is aware of potential conflicts is fairer for all participants.
Thus the current proposal, which is not definitive. In particular, if you have any comments, you are more than welcome to participate in the discussion in the RFC above.
OCaml intern for Claudius
Shreya Pawaskar announced
Hello Everyone! 👋👋
I am Shreya Pawaskar, an Outreachy intern working on the Claudius Project.
A little late to post my blogs here. But here we go!
This is my first blog. Here I've talked about my experience during the contribution phase and my fav contribution.
And here's the second one. This one is all about the journey and what I learned working with ocaml-gif. And of course, the beautiful cover image for my second blog is built and captured with Claudius!
An example for every OCaml package
John Whitington announced
(One day, maybe).
Wouldn't it be nice if every OCaml package had examples as well as documentation? As part of a pilot programme funded by the OCaml Software Foundation, I've been looking into the feasibility of such an idea.
What do we mean by examples, and what distinguishes them from documentation and from tests?
What an example is
- Examples are independent of the library they explain. They do not require the source of the library, or any built artefacts.
- Examples are self-contained. They require only OCaml, a build system, and the library in question to be installed.
- Examples are easy to build. They are built in a single command, and do not depend on environment.
- Examples are easy to edit and play with. They are of a reasonable size, split into chunks, and are commented liberally.
- Examples use standard techniques. Both in how the library is used, and in how the OCaml code is written.
- Examples are open licensed. Users should be able to copy & paste code from the examples without care.
What an example is not
- Examples are not tests. Unlike tests, examples do not care about code coverage, cannot be automatically generated, and need not necessarily be tightly integrated into the source repository.
- Examples are not in the API documentation. Examples need to be buildable, and separate from the API documentation. This is not to say that they might then not be automatically imported into the API documentation one day.
- Examples need not be comprehensive. Better a small example than no example at all. So long as the basics of an API are introduced, the cliff is climbed and the API documentation should thereafter suffice.
Pilot project plan
The plan was to build small examples for about twenty packages, put together a place for them to live, and then try to upstream them. The examples' home, prior to upstreaming, is the OCaml Nursery:
https://github.com/johnwhitington/ocaml-nursery
Most of these little examples have been submitted to upstream - you may have noticed the pull requests on your repositories - with varying degrees of interest / success.
Opinions requested, please!
Are you interested in adding examples for your package or someone else's package? To the nursery or to upstream? What do you think of the definition of example I gave above? Do you think examples should sit in a separate space like the nursery or be upstreamed or both? Opinions requested on all those topics, please!
Esa 0.1.0 - Enhanced Suffix Array (and further plans)
Geoffrey Borough announced
I just ported the original C++ Enhanced Suffix Tree to pure OCaml, you can find it here: https://github.com/gborough/esa.
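For readers unfamiliar with the data structure, an enhanced suffix array is commonly described as a suffix array paired with auxiliary tables such as the longest-common-prefix (LCP) array. The sketch below is a deliberately naive, standard-library-only illustration of those two tables. It is not taken from the esa package: the function names suffix_array and lcp_array are hypothetical, and a real implementation would use a far more efficient construction.

```ocaml
(* Naive illustration only: sorts suffixes by direct comparison,
   which is much slower than the algorithms used in real libraries. *)

(* Starting positions of all suffixes of [s], in lexicographic order. *)
let suffix_array (s : string) : int array =
  let n = String.length s in
  let suffix i = String.sub s i (n - i) in
  let sa = Array.init n (fun i -> i) in
  Array.sort (fun i j -> String.compare (suffix i) (suffix j)) sa;
  sa

(* lcp.(r) is the length of the common prefix of the suffixes ranked
   r-1 and r in [sa]; lcp.(0) is 0 by convention. *)
let lcp_array (s : string) (sa : int array) : int array =
  let n = String.length s in
  let common i j =
    let rec go k =
      if i + k < n && j + k < n && s.[i + k] = s.[j + k] then go (k + 1)
      else k
    in
    go 0
  in
  Array.init (Array.length sa) (fun r ->
      if r = 0 then 0 else common sa.(r - 1) sa.(r))

let () =
  let sa = suffix_array "banana" in
  let lcp = lcp_array "banana" sa in
  Array.iteri (fun r i -> Printf.printf "%d %d %d\n" r i lcp.(r)) sa
```

Each printed row gives the rank, the suffix start position, and the LCP with the previous suffix.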
It's the first time I have attempted to write low allocation/no allocation code in OCaml and I must say this has been a great learning experience for the past few weeks, and it makes me appreciate more how OCaml is able to provide low level tunings that match other low level languages, whilst staying functional at the same time.
One of my personal goals (also our company tech alignment) is to bring OCaml up to the same level of convenience as Python in some areas of AI/LLM. We are inspired by existing efforts in the OCaml community to take on this challenge and our plan of attack will be more or less similar. Currently we are tackling the following problems:
- Porting Google Sentencepiece (in progress): Enhanced Suffix Array done as a dependency, Double-Array Trie and a few other tokenizer utilities in progress.
- Porting Hugging Face Tokenizers (in progress): Pending the completion of sentencepiece, though less dependent code is already being converted.
The end product will probably contain a mixture of pure OCaml as well as a fair amount of FFI code. I dread to think what it is going to look like, as there will obviously be a ton of verbatim translations to OCaml, but I have little doubt about matching C++/Rust performance most of the time. We'll also look into the upcoming OxCaml extension to see if more performance can be eked out.
Hopefully we will have something to show the community in the near future.
Tutorial: cut and pasting code
Daniel Bünzli announced
Dear all,
I sometimes notice that my code gets cut and pasted or vendored literally or modified in other projects. That's very fine, it's the reason why I publish almost all my code under a license that makes that extremely simple.
Yet people often fail to abide by the simple, single phrase request of the license which is (emphasis added):
Copyright (c) [year] [fullname]
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
In all my source files you can find a header that has the copyright notice and a SPDX license identifier as per convention (example). So I kindly ask you when you reuse my code:
- If you copy the whole file, retain that copyright header. Reformatting the header is ok. Also, if you end up reworking the code significantly, indicating:
(* Part of this code is based on the xxx project and Copyright (c) 2020 The xxx programmers. SPDX-License-Identifier: ISC *)
works for me.
- If you copy say a single function, please add a comment with the copyright notice and the SPDX license identifier. For example:
(* This code is Copyright (c) 2011 The xxx programmers. SPDX-License-Identifier: ISC *)
- If you end up vendoring a significant part of the code without modifying it (that is if technically you depend on the project), please mention it in your toplevel LICENSE file.
Now let's be clear, I will not call the police if you don't do this or if you pretend to have written code you did not. Police and lawyers are at the top of the list of people I do not want to deal with or inflict onto other people.
What I'm seeking here is attribution. Not for having my name in your project, I couldn't care less and the copyrights of my projects are contributor based anyways. This is so that the code contribution can be traced for the little times I manage to convince people to pay me for making them rather than investing my own money in these projects.
Btw. this should not only be about my code. This is about any open source code you cut and paste from (and I also do this).
Ah and yes, please, if you are using them, also teach your LLMs to do that. If they are able to write OCaml code it's also thanks to me :–)
Thanks for your attention.
Daniel
QCheck 0.24
Jan Midtgaard announced
QCheck 0.26 is now available from your favorite opam repository! :tada:
https://github.com/c-cube/qcheck/releases
The 0.26 release adjusts the QCheck and QCheck2 float generator distributions, which were previously confined to a subset of floating point numbers. Users may experience that existing tests known to pass start to fail with the new and broader distribution. In addition the release contains a number of other fixes and documentation improvements, incl. the removal of an annoying newline which would cause the test suite to fail on OCaml 5.4.0:
- Align printed collect statistics and also add a percentage
- Fix QCheck{,2}.Gen.float generator which would only generate numbers with an exponent between 2^{-21} and 2^{22}
- Elaborate on the QCheck/QCheck2 situation in the README
- Add a missing description field to the *.opam files
- Document Shrink invariants in the QCheck module
- Fix a qcheck-ounit test suite failure on OCaml 5.4, removing a needless extra newline
- Fix QCheck2 float_range operator which would fail on negative bounds
- Fix QCHECK_MSG_INTERVAL not being applied to the first in-progress message
Thanks to @Pat-Lafon, @rmonat, and @kit-ty-kate for contributing! :pray:
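To make the warning about previously passing tests more concrete, here is a small sketch of a property that holds on a narrow range of magnitudes but not on floats in general. It uses only the standard library rather than QCheck itself, and the property name square_is_finite and the sample inputs are illustrative assumptions, not part of the release.

```ocaml
(* Hypothetical property: squaring a float never overflows to infinity.
   It holds for moderate magnitudes, but not for every float. *)
let square_is_finite (x : float) : bool = Float.is_finite (x *. x)

(* Inputs with magnitudes between 2^-21 and 2^22, the range mentioned
   in the changelog entry for the old generator. *)
let narrow_inputs = [ ldexp 1.0 (-21); 1.0; -3.5; ldexp 1.0 22 ]

(* The same inputs plus a few much larger magnitudes, standing in for
   what a broader distribution might produce. *)
let broad_inputs = narrow_inputs @ [ 1e200; -1e300; max_float ]

let () =
  (* The property looks fine on the narrow inputs... *)
  assert (List.for_all square_is_finite narrow_inputs);
  (* ...but is falsified once large magnitudes are included. *)
  assert (not (List.for_all square_is_finite broad_inputs))
```

A test of this kind was never correct for all floats; a broader generator simply makes the counterexamples reachable.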
New Odoc-3-Generated Package Documentation is Live on OCaml.org
Archive: https://discuss.ocaml.org/t/new-odoc-3-generated-package-documentation-is-live-on-ocaml-org/16967/1
Sabine Schmaltz announced
Hi everyone,
I just merged the patch https://github.com/ocaml/ocaml.org/pull/3124 which enables the new and improved package documentation built with odoc 3 on OCaml.org. Thanks @mtelvers, @jonludlam, @panglesd for putting in the effort to make this happen for OCaml.org!
Thanks to everyone who gave us feedback when we ran this on the staging environment (https://discuss.ocaml.org/t/help-test-the-new-odoc-3-powered-package-documentation-pages/16795/6), we're reasonably confident that things work well enough to apply this upgrade. :orange_heart:
In case you see something that could be improved, please let us know (by replying here or by opening an issue on https://github.com/ocaml/ocaml.org)!
Cheers Sabine
Lwt.6.0.0~alpha (direct-style)
Raphaël Proust announced
It is a great pleasure to announce the first alpha release of Lwt 6. This major version bump brings two major changes to Lwt:
- Using Lwt in direct-style! (Big thanks to @c-cube !!)
- Using multiple Lwt schedulers running in separate domains!
Direct-style
This contribution from @c-cube is available in alpha00. It comes in the form of an lwt_direct package which provides an Lwt_direct module which provides two core functions:
val run : (unit -> 'a) -> 'a Lwt.t
val await : 'a Lwt.t -> 'a
and allows you to write code such as
run (fun () ->
  let continue = ref true in
  while !continue do
    match await @@ Lwt_io.read_line ic with
    | line -> await @@ Lwt_io.write_line oc line
    | exception End_of_file -> continue := false
  done)
There are a few more functions, all of which are documented in lwt_direct.mli.
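For intuition about how a run/await pair with the signatures above can work at all, here is a toy sketch built on OCaml 5 effect handlers. It is not Lwt's implementation, and the announcement does not say how Lwt_direct is implemented: the promise type, resolve and on_resolve are hypothetical stand-ins for Lwt.t and its machinery, and only the standard library is used.

```ocaml
(* Toy illustration only: NOT Lwt's implementation. Requires OCaml 5. *)
open Effect
open Effect.Deep

(* Hypothetical minimal promise: resolved, or pending with callbacks. *)
type 'a state = Resolved of 'a | Pending of ('a -> unit) list
type 'a promise = 'a state ref

(* Resolve a pending promise and run its callbacks in registration order. *)
let resolve (p : 'a promise) (v : 'a) : unit =
  match !p with
  | Resolved _ -> invalid_arg "resolve: already resolved"
  | Pending callbacks ->
    p := Resolved v;
    List.iter (fun k -> k v) (List.rev callbacks)

(* Call [k] with the value of [p], now or once it is resolved. *)
let on_resolve (p : 'a promise) (k : 'a -> unit) : unit =
  match !p with
  | Resolved v -> k v
  | Pending callbacks -> p := Pending (k :: callbacks)

type _ Effect.t += Await : 'a promise -> 'a Effect.t

(* Suspend the current computation until [p] has a value. *)
let await (p : 'a promise) : 'a = perform (Await p)

(* Run [f] in direct style; the returned promise resolves with its result. *)
let run (f : unit -> 'a) : 'a promise =
  let result = ref (Pending []) in
  match_with f ()
    { retc = (fun v -> resolve result v);
      exnc = raise;
      effc = (fun (type b) (eff : b Effect.t) ->
        match eff with
        | Await p ->
          (* Capture the continuation and resume it when [p] resolves. *)
          Some (fun (k : (b, unit) continuation) ->
            on_resolve p (fun v -> continue k v))
        | _ -> None) };
  result

let () =
  let p : int promise = ref (Pending []) in
  let r = run (fun () -> await p + 1) in
  (* The computation is suspended at [await] until [p] is resolved. *)
  resolve p 41;
  assert (!r = Resolved 42)
```

The point of the sketch is that await captures the rest of the computation as a continuation, so code between two awaits reads like ordinary sequential OCaml.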
Multi-scheduler
This addition is not available in alpha00 but should be added to alpha01 soon. It allows calling Lwt_main.run in different domains to benefit from actual parallelism. (Sneak peek in this pull request)
Installation
lwt.6.0.0~alpha00 and lwt_direct.6.0.0~alpha00 will soon be released on opam (PR on opam-repo). I'll publish some more alphas as the work progresses, and announce the releases on this thread.
You can also pin the packages to the lwt-6 branch to get everything a little bit earlier:
opam pin lwt https://github.com/ocsigen/lwt.git#lwt-6
opam pin lwt_direct https://github.com/ocsigen/lwt.git#lwt-6
Feedback
Don't hesitate to chime in on here with any feedback you may have. Ideas, comments, requests, suggestions, etc.
MirageOS on Unikraft
shym announced
On behalf of all the developers involved (namely @fabbing, @Firobe, @n-osborne and me), it’s my pleasure to announce the first release of the Unikraft backend support in MirageOS unikernels.
Unikraft is a unikernel development kit: it is a pretty large collection of components that can be picked up, or not, in the unikernel tradition of modularity. The scope of Unikraft is much larger than Solo5, as it aims to make it easy to turn any Unix server into an efficient unikernel.
This was in fact a first motivation to explore using Unikraft as MirageOS backend: to experiment and see what performance we could get, in particular using their virtio-based network interface, as virtio is implemented currently only for one specific x86_64-only backend in Solo5.
Some of the immediate performance differences we observed are detailed further, but that is not all we hope from this Unikraft backend in the long-term. In particular, Unikraft is on the road to be multicore-compatible (i.e. having one unikernel use multiple cores). While this is not ready today and there are still significant efforts to get there, it means that this MirageOS backend will be able to benefit from these efforts and eventually support the full feature set of OCaml 5.
Furthermore, the Unikraft community (which is quite active) is experimenting with a variety of other targets such as bare-metal for some platforms or new hypervisors (e.g. seL4). Any new target Unikraft supports can be then supported "for free" by MirageOS too. For example, this already brings firecracker as a new supported VMM for MirageOS.
Lastly, since Unikraft is POSIX-compatible (for a large subset of syscalls), this could in the future enable MirageOS unikernels to embed OCaml libraries that have not been ported to use the Mirage interfaces. This would be useful for large libraries which are hard to port (owl comes to mind).
Overview of the Unikraft support
Adding a new MirageOS backend requires creating or modifying a series of components:
- an OCaml cross compiler that can build this new backend, in particular by building its corresponding runtime and providing a way to build unikernel images (instead of normal executables),
- new libraries for the Unikraft system support, and its network and block devices,
- support for the new backends in the mirage tool.
Using Unikraft with a QEMU or a Firecracker backend is as simple as choosing the unikraft-qemu target or the unikraft-firecracker one when configuring a unikernel.
The OCaml/Unikraft cross compiler
To build the OCaml cross compiler to Unikraft, we use the Unikraft core, the Unikraft lib-musl and musl itself. musl is the C library recommended by Unikraft to build programs using the POSIX interface. This made it easy to build the OCaml 5 runtime, in particular because it provides an implementation of the pthread API which is now used in many places in the runtime[^*]. This could also make it easier to port some libraries that depend on Unix to work on Unikraft backends.
[^*]: Adding support for Thread-Local Storage has been a large part of the work to get OCaml 5 working on Solo5: even if the creation of threads is not supported, TLS is still necessary to get the runtime to compile.
The OCaml cross compiler per se builds upon the work that has been upstreamed to ease the creation of cross compilers, using almost the same series of patches as for ocaml-solo5. So the only version of the compiler that is currently supported for OCaml/Unikraft is OCaml 5.3. Almost all the patches will be in the upcoming OCaml 5.4 and there should no longer be any patches required by OCaml 5.5.
Note that we didn’t go with the full standard Unikraft POSIX stack, which includes lwIP to provide network support. We had a prototype at some point relying on lwIP to validate our progress on other building blocks but it raised many incompatibility issues with the standard MirageOS network stack so we dropped support for lwIP in that first release; we developed instead the libraries required to plug the MirageOS stacks into the low-level interfaces provided by the Unikraft core.
The new MirageOS libraries for Unikraft support
The Unikraft support comes with packages using the standard names: mirage-block-unikraft and mirage-net-unikraft to support the block and network devices. Those libraries are implemented directly on top of the low-level Unikraft APIs, and so are using virtio on both QEMU and Firecracker VMMs. To evaluate the quality of the implementations for those devices, we ran a couple of small benchmarks. You can find those benchmarks (the unikernels along with some scripts to set them up and run them) in the benchmarks directory in @Firobe’s fork of mirage-skeleton, benchmarks branch.
Network device
To measure the performance of the network stack, we have tweaked the simple network skeleton unikernel to compute some statistics and used a variable number of clients all sending 512MB of null bytes. We have run this benchmark both on a couple of x86_64 laptops and on an LX2160 aarch64 board, all running a GNU/Linux OS.
We have observed a lot of variability in the performance of the solo5-spt unikernel (sometimes better, sometimes worse than unikraft-qemu) depending on the actual computer used, so those measures should be read with a grain of salt.
On two different x86_64 laptops: (figures not reproduced)
On the LX2160 aarch64 board: (figure not reproduced)
Block device
To measure the performance of the block devices, we wrote a simple unikernel copying data from one disk to another. We can see that the performance of unikraft-qemu is lower than solo5-hvt for small buffer sizes; fortunately, the situation improves with larger buffer sizes. We ran this benchmark only on an x86_64 laptop as there’s currently an issue with two block devices on aarch64 on Unikraft.
It is worth mentioning that I/Os can be parallelised, which also gives a significant performance boost. Indeed, mirage-block-unikraft can leverage the parallelised virtio backend of QEMU and Firecracker; it takes care of limiting I/Os to what the hardware supports in terms of both parallelism and sector size.
Current limitations
- In our tests only Linux appeared well supported to compile Unikraft at the moment so we’ve restricted our packages to that OS for now.
- Unikraft supports various backends itself; in this first release, we’ve only added support and tested its two major ones: QEMU and Firecracker.
How to use
To try the new Unikraft backend for MirageOS, you need to use an OCaml 5.3 switch, so create one first if needed. Then add our opam overlay to get access to our latest versions of the packages until they are published on the standard repository and install mirage and the OCaml/Unikraft cross compiler. The short version could be:
$ opam switch create unikraft-test 5.3.0
$ opam repo add mirage-unikraft-overlays https://github.com/Firobe/mirage-unikraft-overlays.git
$ opam install mirage ocaml-unikraft-backend-qemu ocaml-unikraft-x86_64
See below for some explanations about the numerous OCaml/Unikraft packages.
From then on, you can follow the standard procedure (see how to install MirageOS and how to build a hello-world unikernel) to build your unikernel with the Unikraft backend of your choice.
$ mirage configure -t unikraft-qemu
$ make
Details about the various packages for the OCaml/Unikraft cross compiler
The OCaml cross compiler to Unikraft is split up into 14 packages (see the PR to opam-repository for more details) so that users can:
- choose which of the backends (QEMU or Firecracker) and which of the architectures (x86_64 and arm64) they want to install, where all combinations can be installed at the same time,
- choose which architecture is generated when they use the unikraft ocamlfind toolchain by installing one of the two ocaml-unikraft-default-<arch> packages,
- install the ocaml-unikraft-option-debug to enable the (really verbose!) debugging messages.
The virtual packages can be installed to make sure one of the architecture-specific packages is indeed installed:
- ocaml-unikraft can be installed to make sure that there is indeed a unikraft ocamlfind toolchain installed,
- ocaml-unikraft-backend-qemu and ocaml-unikraft-backend-firecracker can be installed to make sure that the unikraft ocamlfind toolchain supports the corresponding backend.
Those virtual packages will be used in particular by the mirage tool when the target is unikraft-qemu or unikraft-firecracker.
All those packages use one of two version numbers. The backend packages use the Unikraft version number they are using, while the OCaml compiler packages per se use version 1.0.0.
Conclusion
This is a first release, which we are experimenting with; we expect to run it in production in the coming months but it may need improvements nevertheless. Notably absent from this release is an early attempt to leverage Unikraft’s POSIX compatibility to implement Mirage interfaces instead of hooking directly to Unikraft’s internal components. This early version used Unikraft’s lwIP-based network stack instead of Mirage’s (fooling Mirage into thinking it was running on Unix), and it may be interesting to revisit this kind of deployment, in particular for easy inclusion of unix-only OCaml libraries in unikernels.
We are eager for reviews, comments and discussion on the implementation, design and approach of this new Mirage backend, and hope it will be useful to others.
Other OCaml News
From the ocaml.org blog
Here are links from many OCaml blogs aggregated at the ocaml.org blog.
Old CWN
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe to the caml-list.