package author identity · ideas · Zulip Chat Archive

Stream: ideas

Topic: package author identity

Richard Feldman (Nov 19 2021 at 20:44):

so Roc doesn't have a package manager yet, but one of the design questions I've been thinking about for a long time is how to deal with package author identity.

This thread talks about a few different options, but as far as I can tell they boil down to a few different approaches:

1. Elm-style (package name == GitHub repository, GitHub is used for identity and authentication), e.g. the package rtfeldman/elm-css corresponds to the repo of the same name.
2. Rust-style (no namespaces, using GitHub for authentication but not for package identity) - e.g. rand
3. npm-style (optional namespaces, using their own signup system for authentication and identity) - e.g. react is un-namespaced, but @google/cloud-analytics is
4. Maven-style (namespaces are required and based on domain names where you have to prove ownership; domain is identity, PGP key for authentication) - e.g. org.springframework:spring
5. Deno-style (import source code directly from a URL, URL is both identity and authentication) - e.g. https://example.com/my/package

I'd love to hear different perspectives on what Roc should do here!

Brendan Hansknecht (Nov 19 2021 at 20:50):

I don't have much in the way of context or ideas here, but I am definitely pro explicit namespaces. Personally, 1 and 5 (assuming forced https) look the most reasonable to me.

Though both rely on external sources for hosting and can more easily be targeted by a malicious user...which is not optimal.

Lawrence Job (Nov 19 2021 at 21:23):

My thinking is that a good package manager would be decentralised or maintained by the community, but able to use big repos like GitHub as a backend. One day this will actually be a decent use-case for blockchain, but I think that's a bit optimistic in 2021.

* Not reliant on a single provider (even if GitHub is de facto, I'd want Roc to outlive GitHub)
* Signed or verifiable source
  * Who made it?
  * Is anyone else willing to countersign versions of a package as 'trusted'?
* Immutable history of published code
* Presumably sits alongside local modules?

Noob question: Are packages in this sense platforms AND source modules? Are they compiled artefacts or source? (Is that going to be a problem for platforms?)

Countersigning

I'd really like to see a package manager that allows third parties to audit/advocate for a library, and for people to choose which authorities they're willing to trust. Hypothetically, it would be nice to see that roc-math (or whatever) has been countersigned as safe by teams/people at Google, NoRedInk, whatever (on a version by version basis), where users can decide 'yeah I trust Apple but not NRI'.

Decentralising has the issue that anyone can contribute a package, and the onus is on the developer to fully read the library or blindly trust the community. Currently bigger, security-focused companies have to manage a white-list internally, which doesn't feel adequate.

Countersigning could look like a PR to add a file at /signatories on a package repo OR be centralised in the package manager system itself - the latter is more ergonomic but less decentralised.
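
A minimal sketch of the per-version countersigning idea, assuming the countersignatures are already available as plain data. The function name and data layout are hypothetical, and no cryptographic signature checking is shown here; this only illustrates how "which authorities do I trust?" could be applied to one specific version.

```python
def trusted_endorsers(
    package: str,
    version: str,
    countersignatures: dict[tuple[str, str], set[str]],
    trusted: set[str],
) -> set[str]:
    """Return the signatories of this exact version that the user trusts.

    countersignatures maps (package, version) to the names of everyone
    who countersigned that version. Real signatures would have to be
    verified before they get into this table; that step is not shown.
    """
    signed_by = countersignatures.get((package, version), set())
    return signed_by & trusted


# Hypothetical data: one version of roc-math, countersigned by two parties.
signatures = {("roc-math", "2.4.5"): {"Google", "NoRedInk"}}
my_trust = {"Apple", "Google"}
endorsers = trusted_endorsers("roc-math", "2.4.5", signatures, my_trust)
```

Because the lookup is keyed by version, an endorsement of 2.4.5 says nothing about 2.4.6, which matches the "version by version" framing above.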

I've been thinking about this for a while and want to propose this to other package repositories too, but I think here might be an opportunity to think about the problem from the beginning.

Richard Feldman (Nov 19 2021 at 21:27):

My thinking is that a good package manager would be decentralised or maintained by the community

so to me, one of the "now that I've had this, I can never go back" features of Elm's package manager is that it enforces semantic versioning - like if I try to push a change that could cause a type mismatch for someone using my package, I have to bump the major version number or the package repo won't accept it.

I don't think that's possible if you have a decentralized package management system, unless I'm missing something!
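
A rough sketch of the kind of check being described, assuming a package's exposed API can be summarised as a mapping from names to type signatures. The names and the exact acceptance rule are assumptions for illustration, not Elm's actual implementation.

```python
from enum import Enum


class Bump(Enum):
    PATCH = "patch"
    MINOR = "minor"
    MAJOR = "major"


def required_bump(old_api: dict[str, str], new_api: dict[str, str]) -> Bump:
    """Compare two exposed APIs, each given as {name: type signature}."""
    for name, signature in old_api.items():
        # A removed or retyped value could cause a type mismatch for users.
        if new_api.get(name) != signature:
            return Bump.MAJOR
    # Everything old is intact, so anything extra is a pure addition.
    if set(new_api) - set(old_api):
        return Bump.MINOR
    return Bump.PATCH


def next_version(current: tuple[int, int, int], bump: Bump) -> tuple[int, int, int]:
    major, minor, patch = current
    if bump is Bump.MAJOR:
        return (major + 1, 0, 0)
    if bump is Bump.MINOR:
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)


def accept_publish(current, proposed, old_api, new_api) -> bool:
    """Assumed rule: accept only the version that the API diff calls for."""
    return proposed == next_version(current, required_bump(old_api, new_api))
```

The check needs the previously published API to compare against, which is why a single authoritative index makes it straightforward to enforce.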

Jeroen Engels (Nov 19 2021 at 21:27):

I don't know about identity/authentication, but I feel like mandatory author name is the way to go. I like 1, but maybe with hosting of the packages outside of GitHub, seeing as that caused problems for Elm :thinking:

Richard Feldman (Nov 19 2021 at 21:28):

Are packages in this sense platforms AND source modules? Are they compiled artefacts or source? (Is that going to be a problem for platforms?)

great question!

publishing shared Roc code (the typical case) would just be plain old Roc code, but publishing a platform as a package would involve shipping both the Roc code and the compiled binaries for the platform (for whatever targets that platform supports - e.g. macOS binaries, Windows binaries, Linux binaries, wasm binaries, etc.)

Richard Feldman (Nov 19 2021 at 21:32):

a concern about the Deno design is that it's susceptible to both left-pad (package deletions breaking everything) as well as malicious takeovers and re-publishing (which can be mitigated if you've downloaded it before and have a checksum already, but where you're in deep trouble if you're installing it for the first time on a new project on a new machine)
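
The checksum mitigation mentioned above can be sketched as follows, assuming a lockfile that maps each URL to a SHA-256 digest. The lockfile shape and function names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_download(url: str, data: bytes, lockfile: dict[str, str]) -> str:
    """Return 'verified', 'mismatch', or 'first-use' for a downloaded package."""
    digest = sha256_hex(data)
    pinned = lockfile.get(url)
    if pinned is None:
        # Nothing to compare against: the content is trusted on first use
        # and pinned so that later downloads can be checked.
        lockfile[url] = digest
        return "first-use"
    return "verified" if pinned == digest else "mismatch"
```

The "first-use" branch is the weak spot described above: on a new project on a new machine there is no pinned digest yet, so a re-published package cannot be detected.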

Lawrence Job (Nov 19 2021 at 21:39):

Richard Feldman said:

enforces semantic versioning - like if I try to push a change that could cause a type mismatch for someone using my package

Wow, not coming from Elm so didn't realise this! That's fantastic and well worth having. In a blockchain future, things could be rejected from the ledger if they don't pass some kind of test (like the versioning requirement), and in the complicated scenario, this could be a requirement for third party endorsement. (Some kind of endorsement bot that runs tests on packages?)

... but for a busy and small project I can see how this could be way too complicated for day 0. I know I'm being a bit radical here, but thought it better to say than not!

Fwiw I've never been bitten by an API change in third party code, although I do try to minimise surface area in my (enterprise) projects to avoid things like this in the first place. This wouldn't preclude a breaking change at the logic level, which is much more likely to break things in a version bump.

Lawrence Job (Nov 19 2021 at 21:45):

In the case of shipping compiled binaries for a given platform (which I can absolutely understand), would the package manager start to get into very sketchy territory? what safeguards can it provide?

if it's impossible to expect a roc app developer to have the whole toolchain to compile the platform, would the package manager need to be custodian for the build process - taking a platform in a supported language and producing the build artefacts itself? maybe off topic, but I'm very curious to know how this would be solved

clearly platform developers will need to be held to an infinitely higher standard than app developers... maybe even whitelisted by the community at first -- could roc ship in 'safe mode' for beginners?

edit after more thought: or ship with a limited bunch of endorsed platforms (web, console, playground, embedded) and let developers download/share at own risk?

Martin Stewart (Nov 19 2021 at 21:50):

Richard Feldman said:

My thinking is that a good package manager would be decentralised or maintained by the community

so to me, one of the "now that I've had this, I can never go back" features of Elm's package manager is that it enforces semantic versioning - like if I try to push a change that could cause a type mismatch for someone using my package, I have to bump the major version number or the package repo won't accept it.

I don't think that's possible if you have a decentralized package management system, unless I'm missing something!

Can't you derive the version numbers by looking at the source? In other words, would it be possible for packages to not include version numbers and instead you'd determine it by looking at the API changes between each version?

Richard Feldman (Nov 19 2021 at 21:52):

Can't you derive the version numbers by looking at the source?

that's possible, but it requires downloading and type-checking all the sources of every version that's ever been published, which could take a while for a project that's had a lot of releases :sweat_smile:
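
To illustrate why deriving versions means visiting every release, here is a small sketch. It assumes each release's exposed API is already available as a {name: type signature} mapping (which in practice would require downloading and type-checking that release) and that the first release is 1.0.0; both are assumptions made for the example.

```python
def derive_versions(api_history: list[dict[str, str]]) -> list[tuple[int, int, int]]:
    """Assign a version to each release from API diffs alone.

    api_history holds the exposed API of every release, oldest first.
    """
    versions = []
    major, minor, patch = 1, 0, 0
    previous = None
    for api in api_history:  # one step per release ever published
        if previous is not None:
            changed = any(api.get(name) != sig for name, sig in previous.items())
            added = bool(set(api) - set(previous))
            if changed:
                major, minor, patch = major + 1, 0, 0
            elif added:
                minor, patch = minor + 1, 0
            else:
                patch += 1
        versions.append((major, minor, patch))
        previous = api
    return versions
```

The version of the latest release depends on every diff before it, so nothing can be skipped unless the results are cached somewhere.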

Richard Feldman (Nov 19 2021 at 21:58):

if it's impossible to expect a roc app developer to have the whole toolchain to compile the platform, would the package manager need to be custodian for the build process - taking a platform in a supported language and producing the build artefacts itself? maybe off topic, but I'm very curious to know how this would be solved

I don't think the Roc compiler or package manager should become coupled to another language's compiler. Then we get into situations like there being demand for a Roc release every time a new version of Rust comes out, or Zig, or...etc. I also think Roc application authors should be able to drop a roc executable onto a fresh operating system install and be able to do full Roc development without needing anything else, so they also shouldn't need to build hosts.

Putting those together, having platform authors build their own host binaries for each target, and upload them directly to the package repo, seems like the way to go! :big_smile:

In the case of shipping compiled binaries for a given platform (which I can absolutely understand), would the package manager start to get into very sketchy territory? what safeguards can it provide?

I can't think of any safeguards that it could realistically provide, to be honest - I think it's one of those "be very sure that you trust the author of this platform before you install it!" situations. You can presumably verify the binaries yourself by building from source and checking that they're identical to the binaries that got downloaded, but I'm not sure how the package manager could automate that without getting into the "building hosts from source" game!

Richard Feldman (Nov 19 2021 at 22:02):

clearly platform developers will need to be held to an infinitely higher standard than app developers... maybe even whitelisted by the community at first -- could roc ship in 'safe mode' for beginners?
or ship with a limited bunch of endorsed platforms (web, console, playground, embedded) and let developers download/share at own risk?

I think platform authors definitely need to be treated with more caution than other package authors, but one thing I learned from Elm is that a lot of people feel very strongly that "blessed packages" with special status are unfair to other package authors.

I think having a "tutorial" platform ship with the language for purposes of an interactive tutorial seems reasonable, but I'd imagine its API would be limited to just the tutorial exercises, so it wouldn't really compete with "real" platforms

Lawrence Job (Nov 19 2021 at 22:10):

How many languages do we expect there to be realistically? If it's zig, rust, swift and a couple others, is it possible for the package manager to take the source and compile it itself, at least showing some kind of chain of custody?

I totally agree that blessed packages are harmful, but maybe platforms are potentially dangerous enough that they need vetting. I know I'd feel better as a developer if I knew that someone had checked it. Rn we're asking developers to trust Roc, and the platform, with their systems. I think trusted platforms would eventually emerge and there'd be a de facto blessed list anyway.. maybe that's where third party endorsements come in?

ps love the idea of a tutorial platform

Lawrence Job (Nov 19 2021 at 22:12):

Richard Feldman said:

Can't you derive the version numbers by looking at the source?

that's possible, but it requires downloading and type-checking all the sources of every version that's ever been published, which could take a while for a project that's had a lot of releases :sweat_smile:

interesting - with elm is that done by the package manager or before publishing?

Brendan Hansknecht (Nov 19 2021 at 22:15):

Any language that speaks the C ABI is likely a valid host. So that is essentially all languages, though most likely a small handful will be the common ones (zig, rust, c/c++).

As for platforms, we should probably attempt to require source and binaries be published as part of packages. Then users can decide whether to just go with the binary or build from source. Maybe roc could even support some sort of build or make command for users who opt into source, but that would probably require the user to install the correct compiler and dependencies. The build command would just enable easier source dependencies.

Richard Feldman (Nov 19 2021 at 22:18):

I worry about a "build from source" option ending up being a lot of complexity that's ultimately security theater in practice

Brendan Hansknecht (Nov 19 2021 at 22:19):

Still doesn't stop someone from posting a malicious binary, but it is barely different than someone maliciously updating a library. Not like most people will read the code before pulling an update.

Richard Feldman (Nov 19 2021 at 22:19):

like what percentage of people who have the ability to build from source automatically are going to audit that source with the level of care that would be required to notice that something malicious has slipped in?

Richard Feldman (Nov 19 2021 at 22:19):

because that amount of time is probably measured in days, not minutes

Richard Feldman (Nov 19 2021 at 22:20):

and at that point, are we really saving them a significant amount of time compared to asking them to build that binary and compare it to the one that was downloaded automatically? :big_smile:

Brendan Hansknecht (Nov 19 2021 at 22:20):

I think they would just be opting out of the automatic download fully

Brendan Hansknecht (Nov 19 2021 at 22:20):

No comparing

Richard Feldman (Nov 19 2021 at 22:20):

well but the only difference there is hard disk space usage haha

Richard Feldman (Nov 19 2021 at 22:21):

like if the package manager downloads a binary and puts it in a folder, that's not harmful until I try to run it

Brendan Hansknecht (Nov 19 2021 at 22:21):

That being said, I do agree that it would mostly be security theater assuming that the source was available and buildable otherwise.

Richard Feldman (Nov 19 2021 at 22:21):

so as long as I verify it before it gets run, there's no harm

Lawrence Job (Nov 19 2021 at 22:23):

I think there's definitely an element of trust. On the subject of ergonomics, a developer wants to trust that a platform does what it says it will do, and one of Roc's selling points is the implicit promise that a platform is a portable abstraction and safe sandbox for their code to run in.

As you've said, a developer wanting to trust a platform definitely doesn't mean they will or are even qualified to read through the source of a platform, so there has to be some kind of community voting aspect to this..

Richard Feldman (Nov 19 2021 at 22:24):

interestingly, an opaque C binary is precisely as safe to run as a Python script you didn't read, but for some reason the latter is generally considered innately more trustworthy - which leads basically everyone to run them without reading them :laughing:

Brendan Hansknecht (Nov 19 2021 at 22:24):

But I would totally agree that depending on x giant web framework written in rust vs depending on x giant web framework compiled into a roc platform is mostly the same. Either way it is too large for most groups to verify. So you are running on something you trust is written correctly and not malicious. Being based on rust just means someone could look at the code and could make fixes

Lawrence Job (Nov 19 2021 at 22:26):

(replying to an earlier question and consequently ruining the flow of conversation :innocent: ) I think the best a package manager can do is be as transparent as possible, and let someone who is interested take a look. I think public source+artefacts is bare minimum. Is there a simpler way to prove that a binary came from a source than needing to build it?

Lawrence Job (Nov 19 2021 at 22:27):

I totally agree @Richard Feldman - but it's possible that Roc's platform+application split makes a difference here

Richard Feldman (Nov 19 2021 at 22:27):

yeah non-platform packages are extremely trustworthy :smiley:

Richard Feldman (Nov 19 2021 at 22:28):

but platform packages are no more trustworthy than any other language

Lawrence Job (Nov 19 2021 at 22:28):

So, if I understand the mood in the room, it sounds like platform vendors are just going to have to build up a reputation of trust? I think that's adequate for sure

Richard Feldman (Nov 19 2021 at 22:28):

yeah I think so

Lawrence Job (Nov 19 2021 at 22:29):

Awesome. Also, doesn't preclude a more elegant solution for v2 in 5 years (if there's a demonstrated need)

Richard Feldman (Nov 19 2021 at 22:29):

indeed!

Brendan Hansknecht (Nov 19 2021 at 22:30):

Yeah, might be good to expose some useless popularity metrics to help people understand the trust a developer has garnered (GitHub stars, downloads, something else)

Richard Feldman (Nov 19 2021 at 22:30):

yeah I've been thinking about that too - Evan pointed out that a lot of those metrics have a lot of problems

Richard Feldman (Nov 19 2021 at 22:31):

for example, "downloads" is a proxy for "how often does this get run on CI builds that aren't properly configured to cache downloaded packages?" - not necessarily "how many people are using this?"

Lawrence Job (Nov 19 2021 at 22:31):

The CLI could at least say "hey just so you know, this package hasn't been downloaded much, are you sure you want this?" -- (I'm really interested in Roc being a language to learn to code with, so) maybe some naive advice during installation is all we need

Richard Feldman (Nov 19 2021 at 22:31):

and GitHub stars can easily be more of a measure of blog posts and HN exposure than quality/trustworthiness

Brendan Hansknecht (Nov 19 2021 at 22:32):

Yeah, that is why I called them useless. They are proxies and can easily be propped up, but have some form of merit

Richard Feldman (Nov 19 2021 at 22:33):

a concern about that is that it means attackers have an easy way to fake trustworthiness: automate downloads until the package has been downloaded a bunch before they launch their attack

Lawrence Job (Nov 19 2021 at 22:34):

And naturally it would create a stable equilibrium for popular packages, creating the de facto blessed problem again, because newer packages would find it hard to overcome this hill

Richard Feldman (Nov 19 2021 at 22:35):

Evan ended up with an unusual algorithm that's worked well in practice but which raises some obvious objections: Elm packages are ranked by the number of times the author has given a talk at a dedicated Elm conference - https://github.com/elm/package.elm-lang.org/blob/d4d5a997a5d9d6622694c488e6a3ae9f537da761/src/backend/Memory.hs#L282

Lawrence Job (Nov 19 2021 at 22:36):

Now I know what sits on the other end of the spectrum from 'decentralised' :D

Richard Feldman (Nov 19 2021 at 22:38):

yeah - personally I care more about "resilient" than "decentralized" - for example, one general category of designs that appeals to me is "there's a single default Roc package index, but you can configure your local client to switch to a different one, and since all the data in the index is publicly available, people could mirror it and recreate the whole ecosystem on short notice if necessary"
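
A toy sketch of that "single default index, switchable client, mirrorable data" shape. The URLs, config key, and function names are all hypothetical, and indexes are modelled as in-memory dictionaries rather than real servers.

```python
DEFAULT_INDEX = "https://index.example.org"  # hypothetical default index URL


def active_index(client_config: dict[str, str]) -> str:
    """Use the client's configured index, else the single default."""
    return client_config.get("index", DEFAULT_INDEX)


def mirror(public_index: dict[str, bytes]) -> dict[str, bytes]:
    """Everything in the index is public, so a mirror is a plain copy."""
    return dict(public_index)


def fetch(
    package: str,
    client_config: dict[str, str],
    indexes: dict[str, dict[str, bytes]],
) -> bytes:
    """Look the package up in whichever index the client points at."""
    return indexes[active_index(client_config)][package]
```

If the default index disappears, a client only needs its `index` setting changed to a mirror to keep resolving the same package names.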

Lawrence Job (Nov 19 2021 at 22:39):

Yeah .. I fully accept that I'm being pie-in-the-sky with my decentralised ideas... glad to have had the discussion though.

Lawrence Job (Nov 19 2021 at 22:43):

To that end, in reply to the first message option 1/2: maybe users can be from multiple trusted vendors, e.g. github:~lawrencejob publishes github:@org-name/package-name

Lucas Rosa (Nov 19 2021 at 22:49):

I think npm-style is the best. One off libraries can use a flat name like rand, a project can have many packages @namespace/package, and companies can group their open source/private stuff @myCompany/internal-code. I personally dislike the GitHub-username/repo format. I find seeing a bunch of usernames a waste of characters

Lucas Rosa (Nov 19 2021 at 22:49):

I definitely 100% would not default to pulling from GitHub. But pulling from GitHub or some url directly should be an available option

Lucas Rosa (Nov 19 2021 at 22:51):

https://hex.pm
this is the one for elixir

Lucas Rosa (Nov 19 2021 at 22:51):

private orgs/packages cost money and that should be more than enough to pay for hosting

Lawrence Job (Nov 19 2021 at 22:52):

I was just about to ask- is the package manager expected to be sponsored? This could be a tonne of bandwidth to worry about...

Lucas Rosa (Nov 19 2021 at 22:53):

this way companies can sign up with an org, and host private packages but since they aren't open source they gotta cough up some money :)

Lucas Rosa (Nov 19 2021 at 22:53):

then maybe have the code for the package manager setup to be open source and easily self-hostable and mirror the main registry

Lawrence Job (Nov 19 2021 at 22:53):

Lucas Rosa said:

I think npm-style is the best. One off libraries can use a flat name like rand, a project can have many packages @namespace/package, and companies can group their open source/private stuff @myCompany/internal-code. I personally dislike the GitHub-username/repo format. I find seeing a bunch of usernames a waste of characters

Totally agree with wasting characters with usernames - if this package manager has its own ID space, I'd suggest a way to differentiate orgs from users from namespaces (different preceding character?)

The advantage of username namespaces is that it permits forks of packages without new names, which would work really well with Elm-style semver constraints

Lucas Rosa (Nov 19 2021 at 22:54):

if it's just a user package I would just not bother including it in the name in the deps file

Lucas Rosa (Nov 19 2021 at 22:54):

I totally dislike the go and Deno style urls

Lucas Rosa (Nov 19 2021 at 22:56):

Lawrence Job said:

Lucas Rosa said:

I think npm-style is the best. One off libraries can use a flat name like rand, a project can have many packages @namespace/package, and companies can group their open source/private stuff @myCompany/internal-code. I personally dislike the GitHub-username/repo format. I find seeing a bunch of usernames a waste of characters

Totally agree with wasting characters with usernames - if this package manager has its own ID space, I'd suggest a way to differentiate orgs from users from namespaces (different preceding character?)

The advantage of username namespaces is that it permits forks of packages without new names, which would work really well with Elm-style semver constraints

I see what you mean, npm has the option for a user package to be flat or prefixed with a namespace (username on the registry)

Lucas Rosa (Nov 19 2021 at 22:56):

this should accommodate forks and stuff

Lucas Rosa (Nov 19 2021 at 22:57):

"hey this package name is taken, but you can use your namespace"

Lawrence Job (Nov 19 2021 at 22:57):

What about inverting it (optionally):

roc-math/lawrencejob@2.4.5

Lucas Rosa (Nov 19 2021 at 22:57):

could be cool, not used to that so it looks funny initially

Lucas Rosa (Nov 19 2021 at 22:57):

Lawrence Job (Nov 19 2021 at 22:58):

I went down a mental rabbit hole of 'what if someone forks that' and had to stop myself..

Lawrence Job (Nov 19 2021 at 22:59):

This being Roc (whitespace aware) maybe it can be cleaner... roc-math by lawrencejob at 2.4.5

Lucas Rosa (Nov 19 2021 at 23:00):

 app "echo"
    packages { base: "platform", rand: "rand", thing: "user/thing", other: "org/other", rename: "org/something" }

Lucas Rosa (Nov 19 2021 at 23:01):

Lawrence Job said:

This being Roc (whitespace aware) maybe it can be cleaner... roc-math by lawrencejob at 2.4.5

hm kinda cute

Lucas Rosa (Nov 19 2021 at 23:01):

what's cool about the packages record right now is how you can essentially rename what a package is referenced as for free

Lawrence Job (Nov 19 2021 at 23:03):

Feels alien to me, but I haven't tried it. Scares me that the same package identifier could mean something else in an adjacent file...

Lucas Rosa (Nov 19 2021 at 23:04):

that should only work in an app header

Lucas Rosa (Nov 19 2021 at 23:04):

the other adjacent files in a project would be interface which don't have a packages field

Lucas Rosa (Nov 19 2021 at 23:05):

I copied the app header in the cli example for that and added some examples

Lawrence Job (Nov 19 2021 at 23:05):

Ah! I confused packages with imports! My bad!

Lucas Rosa (Nov 19 2021 at 23:06):

all good, imports is a valid field here but I omitted it

Lawrence Job (Nov 19 2021 at 23:07):

Newbie question - do packages have constraints as to which platforms they're compatible with? Or does the app need to inject any platform-specific behaviours when invoking library packages?

Lucas Rosa (Nov 19 2021 at 23:07):

yes and no

Lucas Rosa (Nov 19 2021 at 23:08):

if the package doesn't bother importing stuff from a platform to abstract over or something then it should be platform agnostic

Richard Feldman (Nov 19 2021 at 23:08):

One off libraries can use a flat name like rand

a downside of this is squatting: https://crates.io/users/swmon

you can create policies to transfer ownership, but then you have to decide what counts as "squatting" and enforce it on demand - which is a fraught policy to try to come up with, and enforcement is manual; if you don't have a paid Support team (which I think it's safe to assume we won't), who decides what to do in those cases?

this is why Rust crates don't attempt to prevent squatting, and why package name squatting is a huge and common complaint in the Rust ecosystem.

npm has a paid support team, but also the reason npm is now owned by Microsoft is that they couldn't keep the lights on with the revenue they were getting from private packages, given all their expenses :sweat_smile:

Lucas Rosa (Nov 19 2021 at 23:09):

squatting doesn't hurt anyone so I'd go with what rust did

Lucas Rosa (Nov 19 2021 at 23:09):

policing is tough work

Richard Feldman (Nov 19 2021 at 23:09):

prepare for an avalanche of complaints then :laughing:

Lucas Rosa (Nov 19 2021 at 23:09):

fair

Lucas Rosa (Nov 19 2021 at 23:09):

lmao

Lucas Rosa (Nov 19 2021 at 23:09):

not to say there shouldn't be thought put into that, just my initial reaction

Richard Feldman (Nov 19 2021 at 23:10):

totally!

Lawrence Job (Nov 19 2021 at 23:10):

(Even before MS bought them, they were responsible for taking a big chunk out of NPM's revenue stream by having Azure based private NPM repositories for big companies anyway -- among many other companies)

Lucas Rosa (Nov 19 2021 at 23:10):

if someone squats rand, you can just do user/rand and move on

Lucas Rosa (Nov 19 2021 at 23:10):

offering both is an option

Lawrence Job (Nov 19 2021 at 23:10):

rand by lawrencejob :wink:

Lucas Rosa (Nov 19 2021 at 23:11):

or maybe default to registry-user/rand and give packages the ability to become single flat names after hitting a certain download number

Lawrence Job (Nov 19 2021 at 23:11):

That's a good idea! Although hits the blessed problem again

Lucas Rosa (Nov 19 2021 at 23:11):

oh true

Richard Feldman (Nov 19 2021 at 23:12):

the "mandatory namespacing" design at least means if someone tries to squat on cool-web-server/cool-web-server I can make rtfeldman/cool-web-server and people can discover that mine has a better reputation

(although honestly I think it's appealing to disallow having the package name be the same as the namespace, to discourage :point_up: )
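
A small sketch of what validating a mandatory namespace/package identifier could look like, including the "package name can't equal the namespace" rule floated above. The allowed character set and the function name are assumptions.

```python
import re

# Assumed naming rule: lowercase letters, digits and hyphens, starting with a letter.
_NAME = re.compile(r"[a-z][a-z0-9-]*")


def parse_package_id(text: str) -> tuple[str, str]:
    """Split 'namespace/package' and reject anything un-namespaced."""
    parts = text.split("/")
    if len(parts) != 2:
        raise ValueError(f"expected namespace/package, got {text!r}")
    namespace, package = parts
    for part in (namespace, package):
        if not _NAME.fullmatch(part):
            raise ValueError(f"invalid name {part!r}")
    if namespace == package:
        # Discourages claiming e.g. cool-web-server/cool-web-server.
        raise ValueError("package name must differ from its namespace")
    return namespace, package
```

With this rule, `rtfeldman/cool-web-server` parses, while a flat `rand` and `cool-web-server/cool-web-server` are both rejected.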

Lucas Rosa (Nov 19 2021 at 23:12):

twitter check marks but for packages lol

Lucas Rosa (Nov 19 2021 at 23:12):

ok I'm sold, registry-user/package or registry-org/package is the most sane and balanced

Lawrence Job (Nov 19 2021 at 23:13):

Is differentiation possible? ~registry-user/abc and @registry-org/xyz ?

Richard Feldman (Nov 19 2021 at 23:14):

in Elm in practice it's worked out that everybody knows when you say elm-ui you mean mdgriffith/elm-ui, when you say elm-css you mean rtfeldman/elm-css, when you say elm-charts you mean terezka/elm-charts etc

Lucas Rosa (Nov 19 2021 at 23:14):

anything is possible, none of this exists yet!

Richard Feldman (Nov 19 2021 at 23:14):

I'm kinda surprised it's worked out so well, to be honest, but somehow it has! :big_smile:

Richard Feldman (Nov 19 2021 at 23:14):

squatting basically doesn't seem to happen

Lucas Rosa (Nov 19 2021 at 23:14):

yea I think that makes sense, as long as it's not coupled to github

Richard Feldman (Nov 19 2021 at 23:14):

yeah I don't think we should couple to GH

Lawrence Job (Nov 19 2021 at 23:14):

Would make a very nice ide feature to autocomplete/search if that isn't planned already...

Lucas Rosa (Nov 19 2021 at 23:15):

the best part might end up being writing the registry in roc itself, at first tho maybe in something else just to get the ball rolling idk

Lucas Rosa (Nov 19 2021 at 23:16):

Andrew's idea of failing on major version mismatches sounded reasonable as well

Richard Feldman (Nov 19 2021 at 23:16):

yeah that's what Elm does, and I like it :thumbs_up:

Lucas Rosa (Nov 19 2021 at 23:19):

I'm gonna squat packages on myself rvcas/rand on day one

Lucas Rosa (Nov 19 2021 at 23:19):

you'll never stop me

Lawrence Job (Nov 19 2021 at 23:19):

Two last thoughts before I stop philosophising:

1. I kinda think that if someone is willing to publish a package, it needs a namespace/org rather than a username-namespace because it surely stops being about them at that point? What if the author disappears? Succession is weird.
2. What if namespaces are mandatory in v.1 but people can apply for a short alias? Can they be kept in a public repo with PRs/discussion/community input?

Lawrence Job (Nov 19 2021 at 23:20):

Lucas Rosa said:

you'll never stop me

*buys the @rand account*

Lucas Rosa (Nov 19 2021 at 23:21):

1. never let people hard delete their account
2. never let people delete a package

Lucas Rosa (Nov 19 2021 at 23:21):

1 can be tricky tho

Lucas Rosa (Nov 19 2021 at 23:21):

good thought to bring up

Lucas Rosa (Nov 19 2021 at 23:22):

how does one transfer a package to someone else/org

Lawrence Job (Nov 19 2021 at 23:22):

if it's always in an org, then it's a matter of adding more users as contributors, right?

Lucas Rosa (Nov 19 2021 at 23:23):

I see so:

1. user makes an account with the registry
2. user makes a namespace
3. user may or may not add other users to that namespace essentially making that an org/team
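
The three steps above can be sketched as a tiny in-memory model. The class and method names are hypothetical, and authentication and persistence are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    name: str
    members: set[str] = field(default_factory=set)


class Registry:
    def __init__(self) -> None:
        self.users: set[str] = set()
        self.namespaces: dict[str, Namespace] = {}

    def sign_up(self, username: str) -> None:
        if username in self.users:
            raise ValueError(f"username {username!r} is taken")
        self.users.add(username)

    def create_namespace(self, owner: str, name: str) -> Namespace:
        if owner not in self.users:
            raise KeyError(f"unknown user {owner!r}")
        if name in self.namespaces:
            raise ValueError(f"namespace {name!r} is taken")
        namespace = Namespace(name, {owner})
        self.namespaces[name] = namespace
        return namespace

    def add_member(self, actor: str, name: str, new_member: str) -> None:
        namespace = self.namespaces[name]
        if actor not in namespace.members:
            raise PermissionError(f"{actor!r} is not a member of {name!r}")
        if new_member not in self.users:
            raise KeyError(f"unknown user {new_member!r}")
        # With more than one member the namespace is effectively an org/team.
        namespace.members.add(new_member)

    def can_publish(self, username: str, name: str) -> bool:
        namespace = self.namespaces.get(name)
        return namespace is not None and username in namespace.members
```

In this model, handing a package over to someone else is just `add_member`, since packages belong to the namespace rather than to an individual user.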

Lucas Rosa (Nov 19 2021 at 23:24):

can a user make a namespace with the same name as their username?

Lawrence Job (Nov 19 2021 at 23:24):

I don't see why not - they would be orthogonal ID domains

Lucas Rosa (Nov 19 2021 at 23:24):

for what it's worth I think supporting sign ups via GitHub would be fine

Lawrence Job (Nov 19 2021 at 23:25):

oh.. until I create a rvcas namespace and capitalise on your reputation..

Lucas Rosa (Nov 19 2021 at 23:25):

right

Lawrence Job (Nov 19 2021 at 23:25):

Lucas Rosa said:

for what it's worth I think supporting sign ups via GitHub would be fine

agreed, considering it's an openid provider, if GH were not trusted, could just keep adding more and more providers..

Lucas Rosa (Nov 19 2021 at 23:26):

should names be unique across both users and orgs?

Lucas Rosa (Nov 19 2021 at 23:26):

that would solve the problem you just mentioned

Lawrence Job (Nov 19 2021 at 23:26):

I'm not sure how big the problem would be.. afaik nobody's ever successfully stolen an identity on the much more permissive package managers

Lucas Rosa (Nov 19 2021 at 23:27):

I wonder if there's a way to incentivize people to not do silly things instead of putting up fences

Lawrence Job (Nov 19 2021 at 23:27):

... at least I never heard about it because it became obvious...

Lucas Rosa (Nov 19 2021 at 23:27):

what I've learned from blockchain is that if you make someone spend money they won't misbehave

Lawrence Job (Nov 19 2021 at 23:27):

I think that's it - eventually we just need to trust the community to be good people...

Lawrence Job (Nov 19 2021 at 23:28):

Lucas Rosa said:

what I've learned from blockchain is that if you make someone spend money they won't misbehave

I don't know what you mean - I bought the mona lisa for $20 - nobody scammed me!

Lucas Rosa (Nov 19 2021 at 23:29):

what if holding a name that has no activity EVER or some period of time, then it costs money. I can see where maybe this will be an issue for "finished" projects that only take bug fixes

Lawrence Job (Nov 19 2021 at 23:29):

This is an interesting point to raise this issue I was bike-shedding on last night..

I honestly don't think it should be part of the language spec, but maybe it is a consideration for a package manager

https://github.com/rtfeldman/roc/issues/1862

Lucas Rosa (Nov 19 2021 at 23:30):

I'm not advocating one way or the other btw, I'm just letting some thoughts flow for brainstorming

Lawrence Job (Nov 19 2021 at 23:30):

Lucas Rosa said:

what if holding a name that has no activity EVER or some period of time, then it costs money. I can see where maybe this will be an issue for "finished" projects that only take bug fixes

what if a project has no downloads for a certain time?

Lucas Rosa (Nov 19 2021 at 23:30):

or like never been pushed to

Lawrence Job (Nov 19 2021 at 23:30):

I am doing the same - I certainly don't believe blockchain is the answer to a package manager in 2021

Lucas Rosa (Nov 19 2021 at 23:30):

then it gets reclaimed

Lawrence Job (Nov 19 2021 at 23:31):

I think your point about pushing to 'finished' projects was valid - we can't use that

Richard Feldman (Nov 19 2021 at 23:31):

what do people think of the Maven approach to establishing identity? Summary:

my identity is a top-level domain I own, e.g. rtfeldman.com - so maybe I publish a package like rtfeldman.com/roc-cli
the first time I publish to that namespace, I have to put a public key on that domain to prove that I own it, and I use the corresponding private key to "authenticate" myself when publishing packages - the package index remembers this public key, and uses it to authenticate you in the future
if I ever don't renew that domain (e.g. the new owner takes down the public key) or I lose my private key (or the new owner of the site doesn't have it, which is potentially good if the new owner is malicious), by default there's no way to recover access, and packages in that namespace are effectively frozen & can't be published to anymore
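
The three points above amount to trust-on-first-use key pinning, which can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Index` class, `fetch_key`, and `verify` are hypothetical names, and the actual signature check is injected as a callable because a real index would need a proper signature scheme that the Python standard library does not provide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical callables, injected so the sketch stays stdlib-only.
Verify = Callable[[bytes, bytes, bytes], bool]  # (public_key, message, signature) -> valid?
FetchKey = Callable[[str], bytes]               # domain -> key currently served there


@dataclass
class Index:
    fetch_key: FetchKey
    verify: Verify
    pinned: Dict[str, bytes] = field(default_factory=dict)  # domain -> remembered key

    def publish(self, domain: str, package: bytes, signature: bytes) -> bool:
        key = self.pinned.get(domain)
        if key is None:
            # First publish: the key served on the domain proves ownership.
            key = self.fetch_key(domain)
            if not self.verify(key, package, signature):
                return False
            self.pinned[domain] = key  # remembered for all later publishes
            return True
        # Later publishes: only the remembered key counts, so a new domain
        # owner serving a different key cannot publish to the namespace.
        return self.verify(key, package, signature)
```

With no recovery path, losing the private key leaves the namespace frozen, which matches the default described above.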

Lucas Rosa (Nov 19 2021 at 23:32):

clever

Lawrence Job (Nov 19 2021 at 23:32):

Very fair... it's certainly asking a lot of a developer, though

Lucas Rosa (Nov 19 2021 at 23:33):

I like the friction, means bad actors have more hoops to jump

Richard Feldman (Nov 19 2021 at 23:33):

yeah I don't mind friction in publishing packages, especially for the first time

Lawrence Job (Nov 19 2021 at 23:33):

If we're asking them to spend $10 on a domain, maybe they could just spend $10 to register a namespace :innocent:

Lucas Rosa (Nov 19 2021 at 23:33):

and they even have to spend money on a domain to cause trouble, this disincentivizes even more

Lucas Rosa (Nov 19 2021 at 23:33):

I like

Richard Feldman (Nov 19 2021 at 23:33):

like optimizing for "let's barf packages into the ecosystem as fast as possible" is not optimal imo

Lucas Rosa (Nov 19 2021 at 23:34):

exactly

Lucas Rosa (Nov 19 2021 at 23:34):

Lawrence Job said:

If we're asking them to spend $10 on a domain, maybe they could just spend $10 to register a namespace :innocent:

not a bad point

Richard Feldman (Nov 19 2021 at 23:34):

that's interesting! I never thought of that :thinking:

Lucas Rosa (Nov 19 2021 at 23:35):

although there is perceived image there and someone can use their domain for other stuff so it's more value to you using the domain route

Richard Feldman (Nov 19 2021 at 23:35):

I have mixed feelings about it, but it's definitely interesting

Lawrence Job (Nov 19 2021 at 23:35):

I meant it in jest but it might actually be a good way to fund the repo

Lucas Rosa (Nov 19 2021 at 23:35):

private packages and namespaces

Lawrence Job (Nov 19 2021 at 23:37):

Another thought - back to pie in the sky I'm afraid, but worth considering:

All repos so far allow orgs and people to register with any username... nobody has ever required a person to verify their identity (e.g. when you're asked to show a passport to your webcam when you sign up to coinbase). Is this because it's the right thing to do, or because the tech wasn't there?

Lucas Rosa (Nov 19 2021 at 23:37):

could even have it plan based. 5 bucks 5 namespaces, 10 bucks 5 namespaces and 5 private packages, 20 bucks unlimited

Lucas Rosa (Nov 19 2021 at 23:37):

:shrug:

Lucas Rosa (Nov 19 2021 at 23:37):

no charge on package count

Lawrence Job (Nov 19 2021 at 23:38):

Reason I mention this is because what if anyone can upload to the repo, but those who verify their identity get a twitter-style-tick and their contributions are considered more trusted?

Lucas Rosa (Nov 19 2021 at 23:38):

yea all that just depends how involved we would want to be

Lawrence Job (Nov 19 2021 at 23:39):

There are third parties who can do all that gross stuff for us these days (taking care of the privacy caveats etc)

Richard Feldman (Nov 19 2021 at 23:39):

yeah speaking of friction, my thinking on how to make the repo super low cost is to make the contents of the packages hosted somewhere else - e.g. when I go to publish my package, I have to provide a URL (which in practice will presumably almost always be a GitHub Release, but doesn't have to be) that end users will download it from.

on publish, the index writes down the hash of the contents of the package, and sends that to the client - so if someone takes over that URL (whatever it was) and tries to change the contents, it'll just fail to install.

we can also back up all the contents of all the packages (which is way cheaper than serving them!), so if any of them ever starts 404ing (like left-pad), we have all the data necessary to restore them to a different URL on short notice. Ideally, we could get community volunteers (or companies) to offer to run mirror networks to be used as fallbacks in case that happens, and the clients would also validate the mirrors against the hashes in the index, to prevent malicious mirrors
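
A minimal sketch of the client-side check described above, assuming SHA-256 as the hash (the thread does not name one) and a hypothetical `download` callable that returns `None` when a URL 404s. The original URL is tried first, then any mirrors, and a copy is only accepted if it matches the hash recorded in the index.

```python
import hashlib
from typing import Callable, Iterable, Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def install(expected_hash: str,
            urls: Iterable[str],
            download: Callable[[str], Optional[bytes]]) -> bytes:
    """Return the first downloaded copy whose hash matches the index entry."""
    for url in urls:
        data = download(url)  # None models a missing package (e.g. a 404)
        if data is not None and sha256_hex(data) == expected_hash:
            return data
        # Changed or missing contents are skipped, never installed.
    raise RuntimeError("no source served contents matching the index hash")
```

Because every source is checked against the same index hash, a malicious mirror is treated exactly like a hijacked original URL.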

Lucas Rosa (Nov 19 2021 at 23:39):

cool

Richard Feldman (Nov 19 2021 at 23:39):

the hash being stored in the index (and the contents being backed up, and ideally mirrored) makes it different from the Deno approach

Lawrence Job (Nov 19 2021 at 23:39):

GitHub action to, on release, send the version and hash to the repo?

Richard Feldman (Nov 19 2021 at 23:40):

sure!

Lucas Rosa (Nov 19 2021 at 23:40):

yea I was about to say how much do we want to distribute this system

Lucas Rosa (Nov 19 2021 at 23:40):

would be cool if volunteers could spin up a mirror with a few commands to help the network

Richard Feldman (Nov 19 2021 at 23:40):

yeah totally!

Lawrence Job (Nov 19 2021 at 23:40):

Sorry wrote that before your messages popped up..

Richard Feldman (Nov 19 2021 at 23:40):

roc volunteer :big_smile:

Lucas Rosa (Nov 19 2021 at 23:41):

even provide config files, scripts, terraform, and k8 setups so people can host it in a bunch of ways

Lawrence Job (Nov 19 2021 at 23:41):

(whispers blockchain)

Lucas Rosa (Nov 19 2021 at 23:41):

Richard Feldman said:

roc volunteer :big_smile:

woah, that's kinda cool

Lawrence Job (Nov 19 2021 at 23:42):

If the index is just (namespace, name, version, hash) - how big might that be?

Lucas Rosa (Nov 19 2021 at 23:42):

Lawrence Job said:

(whispers blockchain)

I've definitely thought about it tbh, I just can't come up with a good economic model for it

Lawrence Job (Nov 19 2021 at 23:42):

Roccoin :D

Lawrence Job (Nov 19 2021 at 23:42):

This is how it all goes downhill

Lucas Rosa (Nov 19 2021 at 23:43):

I think there are projects related to this out there

Lucas Rosa (Nov 19 2021 at 23:43):

I've never checked but I think I remember a "git"coin kinda thing

Lucas Rosa (Nov 19 2021 at 23:43):

there's also radicle

Lucas Rosa (Nov 19 2021 at 23:43):

https://radicle.xyz

Lucas Rosa (Nov 19 2021 at 23:43):

I tried it out, it's kinda neat

Lawrence Job (Nov 19 2021 at 23:44):

100,000 packages with 100 stored versions and a cost of (40+40+20+40)bytes = 1.4GB + overhead, so Roc package manager nodes would be feasible
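
The estimate above can be checked with a few lines of arithmetic. The assignment of the four byte counts to namespace, name, version, and hash (in that order) is an assumption based on the index tuple mentioned earlier.

```python
packages = 100_000
versions_per_package = 100
# Assumed mapping: namespace + name + version + hash
bytes_per_entry = 40 + 40 + 20 + 40

total_bytes = packages * versions_per_package * bytes_per_entry
print(total_bytes / 1e9)  # size in decimal GB, before any overhead
```

That is 10 million entries at 140 bytes each, which comes to the 1.4GB quoted.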

Lawrence Job (Nov 19 2021 at 23:45):

(and it's plenty small enough for a blockchain but I know I'm going to get booted from the server if I bring it up again)

Lucas Rosa (Nov 19 2021 at 23:45):

why I don't hate blockchain

Lucas Rosa (Nov 19 2021 at 23:45):

it's contentious but we also aren't shilling coins, just thinking about package management and auth

Lawrence Job (Nov 19 2021 at 23:46):

I'm not saying that it's a good thing, but it would get headlines

Lucas Rosa (Nov 19 2021 at 23:46):

oh yea for sure

Lawrence Job (Nov 19 2021 at 23:46):

it doesn't have to be a widely distributed ledger either - it could happily run on 10/100s of volunteer nodes if that's a model we're exploring

Lucas Rosa (Nov 19 2021 at 23:47):

true

Lucas Rosa (Nov 19 2021 at 23:47):

I wonder what radicle does to work

Lucas Rosa (Nov 19 2021 at 23:47):

I didn't have to pay or use coins to use it

Lawrence Job (Nov 19 2021 at 23:52):

As far as I can tell, the code ledger is only distributed across systems who care about it, so you're not using any compute resource other than your own, but I'm not an expert.

Is it possible to build a centralised package manager for v1 with the notion of and pathway to decentralising it into a ledger for 'volunteers'? Would require some clever design at the beginning

Lucas Rosa (Nov 19 2021 at 23:53):

very cool

Lawrence Job (Nov 19 2021 at 23:54):

One point about leaving the artefacts on github/providers. Assuming we don't have any issues of trust, etc, would the providers be ok with all of the hammering connections they'd get from the package manager client? Is that something other package managers already do/have solved?

Richard Feldman (Nov 19 2021 at 23:58):

GitHub releases are basically designed to be used for this sort of thing, so seems fine

Richard Feldman (Nov 19 2021 at 23:59):

we also asked someone at GitHub in the early Elm days if they'd be ok with it and they said "yeah we don't care as long as you aren't cloning the entire repo, just getting one commit"

Richard Feldman (Nov 19 2021 at 23:59):

which is what homebrew does, for example

Lucas Rosa (Nov 19 2021 at 23:59):

swift pulls from GitHub too I think

Richard Feldman (Nov 20 2021 at 00:00):

so my thinking is basically "how can we use GitHub for hosting in practice without being outright coupled to it"

Lawrence Job (Nov 20 2021 at 00:02):

Seems like

a) github in the identifier as mentioned above (github:lawrencejob/projectname) or
b) assume github for everything unless overridden in config
c) option a + github is implied, so you don't have to add it unless you want to be explicit or use another provider
d) use a completely different ID domain (as discussed above) and have github actions or similar to fire updates to the package manager repo with URL to the specific release artefact plus the hash (or better still, have the pusher sign it with a public key rather than just a hash, solving auth altogether)

Lawrence Job (Nov 20 2021 at 00:06):

In the case of D) that means the roc package manager can be completely naive and take signed hashes and URLs

Richard Feldman (Nov 20 2021 at 00:55):

a challenge we've seen with Elm of "GitHub for ID" is that people change their GitHub usernames sometimes :sweat_smile:

Richard Feldman (Nov 20 2021 at 00:55):

it's caused problems in practice

Lawrence Job (Nov 20 2021 at 00:58):

They of course have underlying integer IDs - when you change a username, does it free up the old username? Is it enough for the PM to just use whatever ID was connected to their account to start with (if the final design does indeed use usernames)?

Richard Feldman (Nov 20 2021 at 01:03):

either way, the problem is that there's all these packages out there referring to oldname/foo

Richard Feldman (Nov 20 2021 at 01:03):

by that name in their code

Richard Feldman (Nov 20 2021 at 01:05):

so the package manager would have to at a minimum detect when that happens and set up redirect rules

Richard Feldman (Nov 20 2021 at 01:06):

and apparently you can indeed resurrect old GitHub usernames: https://www.theregister.com/2018/02/10/github_account_name_reuse/

Lawrence Job (Nov 20 2021 at 01:35):

It sounds like the 'correct' solution is to accept that there is no uniqueness (only the illusion of) constraint in the GitHub username domain and establish a new one:
a) either a new one that is orthogonal or
b) a new one that bears uncanny resemblance and there's ongoing effort to maintain a link (redirects, etc)

github underlying ID for identity verification <- 1:1-> (new, unique username domain) <- 1:m -> package
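
One way to read the mapping above is as two lookup tables, sketched here under assumptions: the `Registry` class and its method names are hypothetical, and the provider ID stands in for GitHub's underlying integer ID.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Registry:
    # provider ID (e.g. GitHub's integer ID) -> registry username, 1:1
    username_by_provider_id: Dict[int, str] = field(default_factory=dict)
    # registry username -> package names, 1:m
    packages_by_username: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, provider_id: int, username: str) -> None:
        if provider_id in self.username_by_provider_id:
            raise ValueError("this provider account already has a username")
        if username in self.packages_by_username:
            raise ValueError("username already taken")
        self.username_by_provider_id[provider_id] = username
        self.packages_by_username[username] = set()

    def publish(self, provider_id: int, package: str) -> str:
        # Identity is looked up by the stable provider ID, so renaming the
        # GitHub account does not change the published package name.
        username = self.username_by_provider_id[provider_id]
        self.packages_by_username[username].add(package)
        return f"{username}/{package}"
```

In this model the registry username never changes on its own, so code referring to `oldname/foo` keeps resolving even after a GitHub rename.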

Richard Feldman (Nov 20 2021 at 01:51):

yeah I'm surprised how appealing the Maven design seems, all things considered

Lucas Rosa (Nov 20 2021 at 02:59):

yea same

Jeroen Engels (Nov 20 2021 at 07:39):

my identity is a top-level domain I own, e.g. rtfeldman.com - so maybe I publish a package like rtfeldman.com/roc-cli

I think the hurdle to publish might be too hard. Say if my company (humio.com for instance) wants to open-source something, then I need to ask the people handling the marketing website to add/serve (and maintain) a public key.

Also, I don't know why, but I feel like you'll be more likely to have fake accounts like humlo.com/xyz with "malicious" packages :sweat_smile:

Richard Feldman (Nov 20 2021 at 12:38):

I dunno, Maven has the second highest number of packages of any package repo (only npm has more) so although that may be a significant hurdle in some cases, I have a hard time concluding that Maven's auth/identity design is keeping it from realizing its potential! :big_smile:

Richard Feldman (Nov 20 2021 at 12:39):

typosquatting is definitely a problem for every package system past a certain size, so we'll need to think about how to mitigate that regardless I think
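
As one possible mitigation (not something proposed in the thread), an index could flag new namespaces that sit within a small edit distance of an existing one, such as humlo.com versus humio.com. The function names and the threshold of 1 are assumptions for illustration.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def lookalikes(candidate: str, existing: Iterable[str],
               max_distance: int = 1) -> List[str]:
    """Existing names that a new name could be mistaken for."""
    return [name for name in existing
            if name != candidate and edit_distance(candidate, name) <= max_distance]


print(lookalikes("humlo.com", ["humio.com", "rtfeldman.com"]))  # ['humio.com']
```

A hit would presumably trigger manual review rather than outright rejection, since legitimate names can also be close to each other.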

Johannes Maas (Nov 20 2021 at 14:36):

I'm not sure whether this might be a different topic: I think having a strong default package system is really important. In addition, having a way to easily import packages from arbitrary sources (file system or URL) would allow for great flexibility.

Johannes Maas (Nov 20 2021 at 14:43):

So having good defaults that work most of the time would be enough, and the package system could rely on those alternatives for the cases where the defaults don't work as well.

Christian Dereck (Nov 20 2021 at 16:23):

I really like the idea to have third parties be able to sign releases. Especially for platforms. Like you said, the platforms are kind of a trust thing. Having a big/trusted company (if roc is able to get them to use roc) say: yes, we analysed that package and it seems safe to use would be nice. Or you could pay for an audit and have that reflected directly in the package.

If you don't allow to delete a package but don't host it yourself, what is the purpose? I mean if the package gets deleted, you only have a dead link.
On the other side, if a creator signs a package you could have mirrors for packages. And if someone decides to delete the original hosted version a mirror could seamlessly take over. The only problem is to manage the key infrastructure. But if you want to be able to sign packages, you need something similar nonetheless?

Not sure I am a fan of the domain owner solution. This adds friction and I am not sure this would stop an attacker. Attacking the ecosystem would be much more expensive than a few euro for a domain. Maven is an established ecosystem, roc has to get there first. And I am not sure that it would add anything that a key signing solution wouldn't add. Isn't there also a problem with utf-8 domains? In this case, you'd also force a package creator to take care of that. (or check when adding to the package repository)

Another aspect to the naming: If you only have 'simple' names, you can't distinguish between official packages and third party ones based on the name.

And a +1 for e.g. direct git urls.

Lawrence Job (Nov 20 2021 at 19:07):

Christian Dereck said:

I really like the idea to have third parties be able to sign releases.

Thanks for saying - I thought it went down like a lead balloon, but I think would be transformative for package managers. I like the idea of an audit/accreditation system growing naturally, too..

If you don't allow to delete a package but don't host it yourself, what is the purpose? I mean if the package gets deleted, you only have a dead link.

Ahhggg good point. Does this rule out getting away without hosting the files? I'm starting to think it might. The mirror point is interesting - could it be used as a fallback for if the original package disappears? Seems like the package manager's going to need a sponsor with deep pockets if the language takes off if preventing deletion is a required feature.

Tim Whiting (Nov 30 2021 at 00:19):

Dart has a concept of verified publishers (where you verify that you own a domain) that is orthogonal to the naming of the package. To publish a package, all you have to do is sign in with a Google account, but you can at any time switch to a verified publisher by linking your domain name. (There is the theoretical problem of name squatting, but since you publish with your gmail / domain name you'd probably be looked down on in the community for doing so, and the dart team handles taking down malicious or bad packages when they are reported, which is pretty infrequent.) When using a verified publisher you can authorize multiple people to upload the package (and the package page displays the email addresses of the uploaders).

Verified packages don't change the package search or give preference, but they do add a badge after the name of the package. Instead, packages can get points for following formatting / style conventions, providing documentation, supporting multiple platforms, passing static analysis, having up to date dependencies, and supporting null safety. These contribute to a score that factors into the search. Additionally it has Likes / Popularity measures that factor in somewhat, but not as much, and end up remapped from a range of [0-1] to [0.5-1] to account for new packages. I've found that the popularity measures / likes are useful to differentiate between similarly named packages that are trying to gain from the popularity of the original package. A lot of the weight of the search is based on a fuzzy match in the readme / package name.

They do have the advantage that hosting the packages and the package documentation is supported by Google.

It also has a mechanism to make the package manager use a different package server, and the package server is open-sourced so anyone can host their own package server which could either proxy to the normal server, or whatever you want.

The search ranking: https://github.com/dart-lang/pub-dev/blob/master/doc/search.md
Quality metrics: https://pub.dev/help/scoring
And publisher / publishing: https://dart.dev/tools/pub/publishing

Thought it might be useful to have another perspective, since many of you are probably not as familiar with dart.
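
To make the popularity remapping mentioned above concrete, here is a small sketch. It assumes the remap from [0-1] to [0.5-1] is linear, which the message does not state, and `remap_popularity` is a hypothetical name rather than anything from pub.dev.

```python
def remap_popularity(raw: float) -> float:
    """Map a raw popularity in [0, 1] onto [0.5, 1] (assumed linear)."""
    if not 0.0 <= raw <= 1.0:
        raise ValueError("raw popularity must be between 0 and 1")
    return 0.5 + 0.5 * raw


print(remap_popularity(0.0))  # 0.5
print(remap_popularity(1.0))  # 1.0
```

The effect is that a brand-new package with no popularity still keeps half of the maximum popularity weight instead of being pushed to zero.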

Lucas Rosa (Nov 30 2021 at 02:18):

cool idea

Last updated: Jul 23 2026 at 13:15 UTC