Avi Press is the founder of Scarf. He has a decade of development experience building high-scale systems at companies like Pandora, as well as open-source developer tools in his spare time.
Open-source package maintainers generally know very little about how their packages are being used. Can package registries help?
I want to understand your background more, and how you came to start Scarf — 0:17
Avi details his start in software engineering at Pandora and in the startup world.
During that period, he started getting into open-source while building dev tools to make his own job easier day-to-day. Those tools became integral and took off.
Eventually, he saw that commercial entities were using his tools, and saw an opportunity.
The atomic unit of Scarf revolves around a registry. How do you define a registry? — 2:50
“Registry is the place that turns a name into a resource, an artifact that you can download.”
People most commonly interact with registries via a package manager, although with Go you can interact with the registry directly in code.
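As a rough illustration of "turning a name into a resource," here is a minimal sketch of a registry lookup. The package name, versions, and URLs are hypothetical, and a real registry would back this with a database and a CDN rather than an in-memory table.

```python
from typing import Dict, Optional, Tuple

# Hypothetical index: (package name, version) -> downloadable artifact URL.
# All names and URLs here are made up for illustration.
INDEX: Dict[Tuple[str, str], str] = {
    ("example-pkg", "1.0.0"): "https://registry.example.com/example-pkg/example-pkg-1.0.0.tgz",
    ("example-pkg", "1.1.0"): "https://registry.example.com/example-pkg/example-pkg-1.1.0.tgz",
}


def resolve(name: str, version: str) -> Optional[str]:
    """Turn a package name and version into an artifact URL, or None if unknown."""
    return INDEX.get((name, version))
```

A package manager does essentially this on your behalf: it asks the registry what a name refers to, then downloads whatever the registry points at.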
Why are registries such an important, fundamental aspect of open source? — 4:00
Registries are basically backbone infrastructure for how we distribute code to one another, and a key part of how open-source has taken off. It would have been much harder to adopt a piece of software if you always had to build the code yourself.
Registries have made it easy to pull a piece of software into your project.
“What does this name refer to?” That’s part of why registries are crucial.
If a registry like NPM suddenly went offline, engineering around the world would grind to a halt.
What question should open-source creators be able to ask of their registries? What should we expect of registries in the open-source world? — 5:28
What’s your business model? It’s free today, but is it free tomorrow? What are the implications of that? How does it stay online? Is it a business? If so, how is it funded? If not, will it stick around?
Do I have to use this registry? Are there other options?
What information will the registry share with me, beyond downloads, that could be useful?
Registries should be aligned with maintainers’ interests. How can maintainers have more insight into the registry as a distribution channel?
Users should want to know a lot about the registry and what its incentives are.
What data do we currently get from registries, in the current form? What do we not get from registries? - 7:30
Most registries show download counts of a package, sometimes in graph form.
Not much else, with some exceptions. Some show version counts and build matrices.
But there is so much data that isn’t being exposed. It is unclear which versions are being used, which platforms downloads are coming from, how many unique visitors there are, or who the commercial users are.
Wish: What packages are being pulled alongside mine? Meaning, which maintainers should I know?
Another wish: more information about where a package sits within a dependency tree.
What data could be useful for open-source maintainers/creators to get from registries? - 9:15
Version adoption is low-hanging fruit that could have a large impact. It could show how much time it takes for a new version to proliferate (in some cases, new versions and bug fixes never even reach users if they don’t update packages).
Also, knowing the unique and commercial users is useful for business decisions.
This data can help maintainers be more proactive, which would give OSS higher quality across the board.
Code and data are kind of one-and-the-same. In OSS, we don’t talk about the data about the code. Usage data could have a large impact for maintainers.
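To make the version-adoption and unique-user ideas concrete, here is a minimal sketch of how such numbers could be derived from download records. The record shape (a client identifier paired with a version string) is an assumption for illustration, not a format any particular registry is known to expose.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Assumed record shape: (client_id, version). Both fields are hypothetical.
Download = Tuple[str, str]


def version_adoption(downloads: List[Download]) -> Dict[str, float]:
    """Return each version's share of all recorded downloads."""
    counts = Counter(version for _, version in downloads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: n / total for version, n in counts.items()}


def unique_users(downloads: List[Download]) -> int:
    """Count distinct client identifiers across all downloads."""
    return len({client for client, _ in downloads})


sample = [("a", "1.0.0"), ("b", "1.0.0"), ("a", "1.1.0"), ("c", "1.0.0")]
shares = version_adoption(sample)
users = unique_users(sample)
```

Tracking these shares over time would show how quickly a new release displaces older ones, which is the kind of signal maintainers rarely get today.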
Talk about decisions for picking registries and switching costs. What are the implications of betting on a given registry/company? What are the switching costs involved? - 11:50
The choice of registry depends on the ecosystem. Docker gives you lots of options, because the Docker client can pull images down by URL.
Even in environments that support pulling from different registries, there are switching costs in the URL. If you want to change the registry, you’ll have to get a new URL for the package. Each time you do that, you’ll likely lose users permanently. If you decide to switch, the move should be permanent, because you will incur that user cost every time.
Mass centralization of registries makes things brittle and misaligns incentives.
If it were easier to switch to a registry that was building a better product, we would have better registries, and we’d have better software.
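The URL switching cost described above can be seen in how Docker image references are written: the registry host is part of the name users type. Below is a simplified sketch of that convention (a first path component containing a dot or colon, or equal to "localhost", is treated as the registry host; otherwise Docker Hub is assumed). The hostnames and image names are hypothetical, and this is not a full reference parser.

```python
from typing import Tuple

DEFAULT_REGISTRY = "docker.io"  # Docker's default when no host is given


def split_image_reference(ref: str) -> Tuple[str, str]:
    """Split an image reference into (registry host, remainder of the reference)."""
    first, sep, rest = ref.partition("/")
    # Only treat the first component as a host if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return DEFAULT_REGISTRY, ref


# Moving the same (hypothetical) image to another registry changes the
# reference that every user has to pull.
old_host, old_rest = split_image_reference("acme/tool:1.2")
new_host, new_rest = split_image_reference("registry.example.com/acme/tool:1.2")
```

Because the host is baked into the reference, every README, script, and deployment config that mentions the old name has to change when the maintainer moves registries.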
So you’re passionate about getting more measurement, telemetry, data, in particular, to better understand who is using the open source you create as a developer. Similar to Salesforce’s motto, you’re bringing developers closer to their customers. That brings us to Scarf. What are you building, what vision are you executing on, and what exists today? — 15:35
The goal of Scarf is to give maintainers more ownership of the distribution of their software.
Scarf is building a suite of tools that can help answer questions for maintainers, even if the registry doesn’t provide that information.
More recently, shifting to the registry side. The data is there. Scarf wants that to be shared more with maintainers.
Working on new tools to get insights into your package distribution, no matter where you’re hosting your package.
At the end of the day, it is about empowering open source maintainers to build better software and have better businesses around their software.
Working on a new product to get at this in the Docker space. If you’re interested in learning more about how your Docker packages are being pulled down, get in touch.
What’s the best place for people to find out about Scarf on the internet? - 19:00
Scarf.sh for website/blog and @scarf_OSS on Twitter.
For email: Avi@scarf.sh. Avi is always eager to chat and connect with people who are interested!