so Roc doesn't have a package manager yet, but one of the design questions I've been thinking about for a long time is how to deal with package author identity.
This thread talks about a few different options, but as far as I can tell they boil down to a few different approaches:
1. rtfeldman/elm-css corresponds to the repo of the same name.
2. rand
3. react is un-namespaced, but @google/cloud-analytics is
4. org.springframework:spring
5. https://example.com/my/package
I'd love to hear different perspectives on what Roc should do here!
I don't have much in the way of context or ideas here, but I am definitely pro explicit namespaces. Personally, 1 and 5 (assuming forced https) look the most reasonable to me.
Though both rely on external sources for hosting and can more easily be targeted by a malicious user...which is not optimal.
My thinking is that a good package manager would be decentralised or maintained by the community, but able to use big repos like Github as a backend. One day this will actually be a decent use-case for blockchain, but I think that's a bit optimistic in 2021.
Signed or verifiable source
** Who made it?
** Is anyone else willing to countersign versions of a package as 'trusted'?
Immutable history of published code
Noob question: Are packages in this sense platforms AND source modules? Are they compiled artefacts or source? (Is that going to be a problem for platforms?)
Countersigning
I'd really like to see a package manager that allows third parties to audit/advocate for a library, and for people to choose which authorities they're willing to trust. Hypothetically, it would be nice to see that roc-math (or whatever) has been countersigned as safe by teams/people at Google, NoRedInk, whatever (on a version by version basis), where users can decide 'yeah I trust Apple but not NRI'.
Decentralising has the issue that anyone can contribute a package, and the onus is on the developer to fully read the library or blindly trust the community. Currently bigger, security-focused companies have to manage a white-list internally, which doesn't feel adequate.
Countersigning could look like a PR to add a file at /signatories on a package repo OR be centralised in the package manager system itself - the latter is more ergonomic but less decentralised.
I've been thinking about this for a while and want to propose this to other package repositories too, but I think here might be an opportunity to think about the problem from the beginning.
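To make the countersigning idea a bit more concrete, here is a minimal sketch of the "choose which authorities you trust" check. Everything in it is hypothetical: the `Endorsement` record, the `is_trusted` function and the `required` threshold are made-up names, and it assumes the signatures themselves were already verified cryptographically somewhere else.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Endorsement:
    """One countersignature: `signer` vouches for one version of one package."""
    package: str
    version: str
    signer: str


def is_trusted(package: str, version: str,
               endorsements: Iterable[Endorsement],
               trusted_signers: Set[str],
               required: int = 1) -> bool:
    """True if at least `required` signers the user trusts endorsed this exact version."""
    signers = {e.signer for e in endorsements
               if e.package == package and e.version == version}
    return len(signers & trusted_signers) >= required


# Hypothetical data: endorsements are per version, not per package.
endorsements = [
    Endorsement("roc-math", "2.4.5", "Apple"),
    Endorsement("roc-math", "2.4.5", "NRI"),
    Endorsement("roc-math", "2.4.6", "NRI"),
]

# A user who trusts Apple but not NRI:
print(is_trusted("roc-math", "2.4.5", endorsements, {"Apple"}))
print(is_trusted("roc-math", "2.4.6", endorsements, {"Apple"}))
```

Whether the endorsements live in a /signatories file in the package repo or in the package manager itself only changes where the list comes from, not the check.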
My thinking is that a good package manager would be decentralised or maintained by the community
so to me, one of the "now that I've had this, I can never go back" features of Elm's package manager is that it enforces semantic versioning - like if I try to push a change that could cause a type mismatch for someone using my package, I have to bump the major version number or the package repo won't accept it.
I don't think that's possible if you have a decentralized package management system, unless I'm missing something!
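A rough sketch of what that kind of enforcement could look like on the registry side. This is not Elm's actual implementation: the API is modelled as a plain dict of exposed name to type signature text, and the rule "the claimed bump must be at least as big as the API diff requires" is an assumption. All function names here are made up.

```python
from typing import Dict, Optional, Tuple

Api = Dict[str, str]            # exposed name -> type signature (as text)
Version = Tuple[int, int, int]  # (major, minor, patch)

RANK = {"patch": 0, "minor": 1, "major": 2}


def required_bump(old: Api, new: Api) -> str:
    """Smallest bump allowed by the API diff between two releases."""
    removed = old.keys() - new.keys()
    changed = {name for name in old.keys() & new.keys() if old[name] != new[name]}
    added = new.keys() - old.keys()
    if removed or changed:
        return "major"   # existing callers could hit a type mismatch
    if added:
        return "minor"   # only new things were exposed
    return "patch"       # exposed API is unchanged


def claimed_bump(old: Version, new: Version) -> Optional[str]:
    """Which part of the version number the author increased, if any."""
    if new[0] > old[0]:
        return "major"
    if new[0] == old[0] and new[1] > old[1]:
        return "minor"
    if new[:2] == old[:2] and new[2] > old[2]:
        return "patch"
    return None  # not an increase at all


def accept_publish(old_version: Version, new_version: Version,
                   old_api: Api, new_api: Api) -> bool:
    """Reject the publish if the claimed bump is smaller than the diff requires."""
    claimed = claimed_bump(old_version, new_version)
    if claimed is None:
        return False
    return RANK[claimed] >= RANK[required_bump(old_api, new_api)]


old_api = {"add": "Int, Int -> Int"}
new_api = {"add": "Float, Float -> Float"}  # changed type: breaking
print(accept_publish((1, 0, 0), (1, 1, 0), old_api, new_api))  # rejected
print(accept_publish((1, 0, 0), (2, 0, 0), old_api, new_api))  # accepted
```

The check needs the previous release's API to compare against, which is why it is easy when one index sees every publish.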
I don't know about identity/authentication, but I feel like mandatory author name is the way to go. I like 1, but maybe with hosting of the packages outside of GitHub, seeing as that caused problems for Elm :thinking:
Are packages in this sense platforms AND source modules? Are they compiled artefacts or source? (Is that going to be a problem for platforms?)
great question!
publishing shared Roc code (the typical case) would just be plain old Roc code, but publishing a platform as a package would involve shipping both the Roc code and the compiled binaries for the platform (for whatever targets that platform supports - e.g. macOS binaries, Windows binaries, Linux binaries, wasm binaries, etc.)
a concern about the Deno design is that it's susceptible to both left-pad (package deletions breaking everything) as well as malicious takeovers and re-publishing (which can be mitigated if you've downloaded it before and have a checksum already, but where you're in deep trouble if you're installing it for the first time on a new project on a new machine)
Richard Feldman said:
enforces semantic versioning - like if I try to push a change that could cause a type mismatch for someone using my package
Wow, not coming from Elm so didn't realise this! That's fantastic and well worth having. In a blockchain future, things could be rejected from the ledger if they don't pass some kind of test (like the versioning requirement), and in the complicated scenario, this could be a requirement for third party endorsement. (Some kind of endorsement bot that runs tests on packages?)
... but for a busy and small project I can see how this could be way too complicated for day 0. I know I'm being a bit radical here, but thought it better to say than not!
Fwiw I've never been bitten by an API change in third party code, although I do try to minimise surface area in my (enterprise) projects to avoid things like this in the first place. This wouldn't preclude a breaking change at the logic level, which is much more likely to break things in a version bump.
In the case of shipping compiled binaries for a given platform (which I can absolutely understand), would the package manager start to get into very sketchy territory? what safeguards can it provide?
if it's impossible to expect a roc app developer to have the whole toolchain to compile the platform, would the package manager need to be custodian for the build process - taking a platform in a supported language and producing the build artefacts itself? maybe off topic, but I'm very curious to know how this would be solved
clearly platform developers will need to be held to an infinitely higher standard than app developers... maybe even whitelisted by the community at first -- could roc ship in 'safe mode' for beginners?
edit after more thought: or ship with a limited bunch of endorsed platforms (web, console, playground, embedded) and let developers download/share at own risk?
Richard Feldman said:
My thinking is that a good package manager would be decentralised or maintained by the community
so to me, one of the "now that I've had this, I can never go back" features of Elm's package manager is that it enforces semantic versioning - like if I try to push a change that could cause a type mismatch for someone using my package, I have to bump the major version number or the package repo won't accept it.
I don't think that's possible if you have a decentralized package management system, unless I'm missing something!
Can't you derive the version numbers by looking at the source? In other words, would it be possible for packages to not include version numbers and instead you'd determine it by looking at the API changes between each version?
Can't you derive the version numbers by looking at the source?
that's possible, but it requires downloading and type-checking all the sources of every version that's ever been published, which could take a while for a project that's had a lot of releases :sweat_smile:
if it's impossible to expect a roc app developer to have the whole toolchain to compile the platform, would the package manager need to be custodian for the build process - taking a platform in a supported language and producing the build artefacts itself? maybe off topic, but I'm very curious to know how this would be solved
I don't think the Roc compiler or package manager should become coupled to another language's compiler. Then we get into situations like there being demand for a Roc release every time a new version of Rust comes out, or Zig, or...etc. I also think Roc application authors should be able to drop a roc executable onto a fresh operating system install and be able to do full Roc development without needing anything else, so they also shouldn't need to build hosts.
Putting those together, having platform authors build their own host binaries for each target, and upload them directly to the package repo, seems like the way to go! :big_smile:
In the case of shipping compiled binaries for a given platform (which I can absolutely understand), would the package manager start to get into very sketchy territory? what safeguards can it provide?
I can't think of any safeguards that it could realistically provide, to be honest - I think it's one of those "be very sure that you trust the author of this platform before you install it!" situations. You can presumably verify the binaries yourself by building from source and checking that they're identical to the binaries that got downloaded, but I'm not sure how the package manager could automate that without getting into the "building hosts from source" game!
clearly platform developers will need to be held to an infinitely higher standard than app developers... maybe even whitelisted by the community at first -- could roc ship in 'safe mode' for beginners?
or ship with a limited bunch of endorsed platforms (web, console, playground, embedded) and let developers download/share at own risk?
I think platform authors definitely need to be treated with more caution than other package authors, but one thing I learned from Elm is that a lot of people feel very strongly that "blessed packages" with special status are unfair to other package authors.
I think having a "tutorial" platform ship with the language for purposes of an interactive tutorial seems reasonable, but I'd imagine its API would be limited to just the tutorial exercises, so it wouldn't really compete with "real" platforms
How many languages do we expect there to be realistically? If it's zig, rust, swift and a couple others, is it possible for the package manager to take the source and compile it itself, at least showing some kind of chain of custody?
I totally agree that blessed packages are harmful, but maybe platforms are potentially dangerous enough that they need vetting. I know I'd feel better as a developer if I knew that someone had checked it. Rn we're asking developers to trust Roc, and the platform, with their systems. I think trusted platforms would eventually emerge and there'd be a de facto blessed list anyway.. maybe that's where third party endorsements come in?
ps love the idea of a tutorial platform
Richard Feldman said:
Can't you derive the version numbers by looking at the source?
that's possible, but it requires downloading and type-checking all the sources of every version that's ever been published, which could take a while for a project that's had a lot of releases :sweat_smile:
interesting - with elm is that done by the package manager or before publishing?
Any language that speaks the C ABI is likely a valid host. So that is essentially all languages, though most likely a small handful will be the common ones (zig, rust, c/c++).
As for platforms, we should probably attempt to require source and binaries be published as part of packages. Then users can decide whether to just go with the binary or build from source. Maybe roc could even support some sort of build or make command for users who opt into source, but that would probably require the user to install the correct compiler and dependencies. The build command would just enable easier source dependencies.
I worry about a "build from source" option ending up being a lot of complexity that's ultimately security theater in practice
Still doesn't stop someone from posting a malicious binary, but it is barely different than someone maliciously updating a library. Not like most people will read the code before pulling an update.
like what percentage of people who have the ability to build from source automatically are going to audit that source with the level of care that would be required to notice that something malicious has slipped in?
because that amount of time is probably measured in days, not minutes
and at that point, are we really saving them a significant amount of time compared to asking them to build that binary and compare it to the one that was downloaded automatically? :big_smile:
I think they would just be opting out of the automatic download fully
No comparing
well but the only difference there is hard disk space usage haha
like if the package manager downloads a binary and puts it in a folder, that's not harmful until I try to run it
That being said, I do agree that it would mostly be security theater, assuming that the source was available and buildable otherwise.
so as long as I verify it before it gets run, there's no harm
I think there's definitely an element of trust. On the subject of ergonomics, a developer wants to trust that a platform does what it says it will do, and one of Roc's selling points is the implicit promise that a platform is a portable abstraction and safe sandbox for their code to run in.
As you've said, a developer wanting to trust a platform definitely doesn't mean they will or are even qualified to read through the source of a platform, so there has to be some kind of community voting aspect to this..
interestingly, an opaque C binary is precisely as safe to run as a Python script you didn't read, but for some reason the latter is generally considered innately more trustworthy - which leads basically everyone to run them without reading them :laughing:
But I would totally agree that depending on x giant web framework written in rust vs depending on x giant web framework compiled into a roc platform is mostly the same. Either way it is too large for most groups to verify. So you are running on something you trust is written correctly and not malicious. being based on rust just means someone could look at the code and could make fixes
(replying to an earlier question and consequently ruining the flow of conversation :innocent: ) I think the best a package manager can do is be as transparent as possible, and let someone who is interested, take a look. I think public source+artefacts is bare minimum. Is there a simpler way to prove that a binary came from a source than needing to build it?
I totally agree @Richard Feldman - but it's possible that Roc's platform+application split makes a difference here
yeah non-platform packages are extremely trustworthy :smiley:
but platform packages are no more trustworthy than any other language
So, if I understand the mood in the room, it sounds like platform vendors are just going to have to build up a reputation of trust? I think that's adequate for sure
yeah I think so
Awesome. Also, doesn't preclude a more elegant solution for v2 in 5 years (if there's a demonstrated need)
indeed!
Yeah, might be good to expose some useless popularity metrics to help people understand the trust a developer has garnered (GitHub stars, downloads, something else)
yeah I've been thinking about that too - Evan pointed out that a lot of those metrics have a lot of problems
for example, "downloads" is a proxy for "how often does this get run on CI builds that aren't properly configured to cache downloaded packages?" - not necessarily "how many people are using this?"
The CLI could at least say "hey just so you know, this package hasn't been downloaded much, are you sure you want this?" -- (I'm really interested in Roc being a language to learn to code with, so) maybe some naive advice during installation is all we need
and GitHub stars can easily be more of a measure of blog posts and HN exposure than quality/trustworthiness
Yeah, that is why I called them useless. They are proxies and can easily be propped up, but have some form of merit
a concern about that is that it means attackers have an easy way to fake trustworthiness: automate downloads until the package has been downloaded a bunch before they launch their attack
And naturally it would create a stable equilibrium for popular packages, creating the de facto blessed problem again, because newer packages would find it hard to overcome this hill
Evan ended up with an unusual algorithm that's worked well in practice but which raises some obvious objections: Elm packages are ranked by the number of times the author has given a talk at a dedicated Elm conference - https://github.com/elm/package.elm-lang.org/blob/d4d5a997a5d9d6622694c488e6a3ae9f537da761/src/backend/Memory.hs#L282
Now I know what sits on the other end of the spectrum from 'decentralised' :D
yeah - personally I care more about "resilient" than "decentralized" - for example, one general category of designs that appeals to me is "there's a single default Roc package index, but you can configure your local client to switch to a different one, and since all the data in the index is publicly available, people could mirror it and recreate the whole ecosystem on short notice if necessary"
Yeah .. I fully accept that I'm being pie-in-the-sky with my decentralised ideas... glad to have had the discussion though.
To that end, in reply to the first message option 1/2: maybe users can be from multiple trusted vendors, e.g. github:~lawrencejob publishes github:@org-name/package-name
I think npm-style is the best. One off libraries can use a flat name like rand, a project can have many packages @namespace/package, and companies can group their open source/private stuff @myCompany/internal-code. I personally dislike the GitHub-username/repo format. I find seeing a bunch of usernames a waste of characters
I definitely 100% would not default to pulling from GitHub. But pulling from GitHub or some url directly should be an available option
https://hex.pm
this is the one for elixir
private orgs/packages cost money and that should be more than enough to pay for hosting
I was just about to ask- is the package manager expected to be sponsored? This could be a tonne of bandwidth to worry about...
this way companies can sign up with an org, and host private packages but since they aren't open source they gotta cough up some money :)
then maybe have the code for the package manager setup to be open source and easily self-hostable and mirror the main registry
Lucas Rosa said:
I think npm-style is the best. One off libraries can use a flat name like rand, a project can have many packages @namespace/package, and companies can group their open source/private stuff @myCompany/internal-code. I personally dislike the GitHub-username/repo format. I find seeing a bunch of usernames a waste of characters
Totally agree with wasting characters with usernames - if this package manager has its own ID space, I'd suggest a way to differentiate orgs from users from namespaces (different preceding character?)
The advantage of username namespaces is that it permits forks of packages without new names, which would work really well with Elm-style semver constraints
if it's just a user package I would just not bother including it in the name in the deps file
I totally dislike the go and Deno style urls
Lawrence Job said:
Lucas Rosa said:
I think npm-style is the best. One off libraries can use a flat name like rand, a project can have many packages @namespace/package, and companies can group their open source/private stuff @myCompany/internal-code. I personally dislike the GitHub-username/repo format. I find seeing a bunch of usernames a waste of characters
Totally agree with wasting characters with usernames - if this package manager has its own ID space, I'd suggest a way to differentiate orgs from users from namespaces (different preceding character?)
The advantage of username namespaces is that it permits forks of packages without new names, which would work really well with Elm-style semver constraints
I see what you mean, npm has the option for a user package to be flat or prefixed with a namespace (username on the registry)
this should accommodate forks and stuff
"hey this package name is taken, but you can use your namespace"
What about inverting it (optionally):
roc-math/lawrencejob@2.4.5
could be cool, I'm not used to that so it looks funny initially
:P
I went down a mental rabbit hole of 'what if someone forks that' and had to stop myself..
This being Roc (whitespace aware) maybe it can be cleaner... roc-math by lawrencejob at 2.4.5
app "echo"
packages { base: "platform", rand: "rand", thing: "user/thing", other: "org/other", rename: "org/something" }
Lawrence Job said:
This being Roc (whitespace aware) maybe it can be cleaner...
roc-math by lawrencejob at 2.4.5
hm kinda cute
what's cool about the packages record right now is how you can essentially rename what a package is referenced as for free
Feels alien to me, but I haven't tried it. Scares me that the same package identifier could mean something else in an adjacent file...
that should only work in an app header
the other adjacent files in a project would be interface which don't have a packages field
I copied the app header in the cli example for that and added some examples
Ah! I confused packages with imports! My bad!
all good, imports is a valid field here but I omitted it
Newbie question - do packages have constraints as to which platforms they're compatible with? Or does the app need to inject any platform-specific behaviours when invoking library packages?
yes and no
if the package doesn't bother importing stuff from a platform to abstract over or something then it should be platform agnostic
One off libraries can use a flat name like
rand
a downside of this is squatting: https://crates.io/users/swmon
you can create policies to transfer ownership, but then you have to decide what counts as "squatting" and enforce it on demand - which is a fraught policy to try to come up with, and enforcement is manual; if you don't have a paid Support team (which I think it's safe to assume we won't), who decides what to do in those cases?
this is why Rust crates don't attempt to prevent squatting, and why package name squatting is a huge and common complaint in the Rust ecosystem.
npm has a paid support team, but also the reason npm is now owned by Microsoft is that they couldn't keep the lights on with the revenue they were getting from private packages, given all their expenses :sweat_smile:
squatting doesn't hurt anyone so I'd go with what rust did
policing is tough work
prepare for an avalanche of complaints then :laughing:
fair
lmao
not to say there shouldn't be thought put into that, just my initial reaction
totally!
(Even before MS bought them, they were responsible for taking a big chunk out of NPM's revenue stream by having Azure based private NPM repositories for big companies anyway -- among many other companies)
if someone squats rand, you can just do user/rand and move on
offering both is an option
rand by lawrencejob :wink:
or maybe default to registry-user/rand and give packages the ability to become single flat names after hitting a certain download number
That's a good idea! Although hits the blessed problem again
oh true
the "mandatory namespacing" design at least means if someone tries to squat on cool-web-server/cool-web-server I can make rtfeldman/cool-web-server and people can discover that mine has a better reputation
(although honestly I think it's appealing to disallow having the package name be the same as the namespace, to discourage :point_up: )
twitter check marks but for packages lol
ok I'm sold, registry-user/package or registry-org/package is the most sane and balanced
Is differentiation possible? ~registry-user/abc and @registry-org/xyz ?
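Purely as a sketch of what differentiated identifiers might look like to a client: the snippet below parses ~registry-user/abc and @registry-org/xyz and also applies the "package name can't equal the namespace" rule floated above. The allowed characters (lowercase letters, digits, hyphens) and all the names in the code are assumptions, since none of this exists yet.

```python
import re
from typing import NamedTuple


class PackageId(NamedTuple):
    kind: str       # "user" or "org"
    namespace: str
    name: str


# Assumed syntax: a sigil (~ for users, @ for orgs), then namespace/name.
_PATTERN = re.compile(r"^([~@])([a-z0-9][a-z0-9-]*)/([a-z0-9][a-z0-9-]*)$")


def parse_package_id(text: str) -> PackageId:
    """Parse a mandatory-namespace package identifier or raise ValueError."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a namespaced package id: {text!r}")
    sigil, namespace, name = match.groups()
    if namespace == name:
        # Discourages squatting on things like cool-web-server/cool-web-server.
        raise ValueError("package name must differ from its namespace")
    return PackageId("user" if sigil == "~" else "org", namespace, name)


print(parse_package_id("~registry-user/abc"))
print(parse_package_id("@registry-org/xyz"))
```

A flat name like rand is rejected by this parser on purpose, matching the mandatory-namespacing design.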
in Elm in practice it's worked out that everybody knows when you say elm-ui you mean mdgriffith/elm-ui, when you say elm-css you mean rtfeldman/elm-css, when you say elm-charts you mean terezka/elm-charts etc
anything is possible, none of this exists yet!
I'm kinda surprised it's worked out so well, to be honest, but somehow it has! :big_smile:
squatting basically doesn't seem to happen
yea I think that makes sense, as long as it's not coupled to github
yeah I don't think we should couple to GH
Would make a very nice ide feature to autocomplete/search if that isn't planned already...
the best part might end up being writing the registry in roc itself, at first tho maybe in something else just to get the ball rolling idk
Andrew's idea of failing on major version mismatches sounded reasonable as well
yeah that's what Elm does, and I like it :thumbs_up:
I'm gonna squat packages on myself rvcas/rand on day one
you'll never stop me
Two last thoughts before I stop philosophising:
Lucas Rosa said:
you'll never stop me
1 can be tricky tho
good thought to bring up
how does one transfer a package to someone else/org
if it's always in an org, then it's a matter of adding more users as contributors, right?
I see so:
can a user make a namespace with the same name as their username?
I don't see why not - they would be orthogonal ID domains
for what it's worth I think supporting sign ups via GitHub would be fine
oh.. until I create a rvcas namespace and capitalise on your reputation..
right
Lucas Rosa said:
for what it's worth I think supporting sign ups via GitHub would be fine
agreed, considering it's an openid provider, if GH were not trusted, could just keep adding more and more providers..
should names be unique across both users and orgs?
that would solve the problem you just mentioned
I'm not sure how big the problem would be.. afaik nobody's ever successfully stolen an identity on the much more permissive package managers
I wonder if there's a way to incentivize people to not do silly things instead of putting up fences
... at least I never heard about it because it became obvious...
what I've learned from blockchain is that if you make someone spend money they won't misbehave
I think that's it - eventually we just need to trust the community to be good people...
Lucas Rosa said:
what I've learned from blockchain is that if you make someone spend money they won't misbehave
I don't know what you mean - I bought the mona lisa for $20 - nobody scammed me!
what if holding a name that has had no activity EVER, or none for some period of time, costs money? I can see where maybe this will be an issue for "finished" projects that only take bug fixes
This is an interesting point to raise this issue I was bike-shedding on last night..
I honestly don't think it should be part of the language spec, but maybe it is a consideration for a package manager
https://github.com/rtfeldman/roc/issues/1862
I'm not advocating one way or the other btw, I'm just letting some thoughts flow for brainstorming
Lucas Rosa said:
what if holding a name that has had no activity EVER, or none for some period of time, costs money? I can see where maybe this will be an issue for "finished" projects that only take bug fixes
what if a project has no downloads for a certain time?
or like never been pushed to
I am doing the same - I certainly don't believe blockchain is the answer to a package manager in 2021
then it gets reclaimed
I think your point about pushing to 'finished' projects was valid - we can't use that
what do people think of the Maven approach to establishing identity? Summary:
rtfeldman.com - so maybe I publish a package like rtfeldman.com/roc-cli
clever
Very fair... it's certainly asking a lot of a developer, though
I like the friction, means bad actors have more hoops to jump
yeah I don't mind friction in publishing packages, especially for the first time
If we're asking them to spend $10 on a domain, maybe they could just spend $10 to register a namespace :innocent:
and they even have to spend money on a domain to cause trouble, this disincentivizes even more
I like
like optimizing for "let's barf packages into the ecosystem as fast as possible" is not optimal imo
exactly
Lawrence Job said:
If we're asking them to spend $10 on a domain, maybe they could just spend $10 to register a namespace :innocent:
not a bad point
that's interesting! I never thought of that :thinking:
although there is perceived image there and someone can use their domain for other stuff so it's more value to you using the domain route
I have mixed feelings about it, but it's definitely interesting
I meant it in jest but it might actually be a good way to fund the repo
private packages and namespaces
Another thought - back to pie in the sky I'm afraid, but worth considering:
All repos so far allow orgs and people to register with any username... nobody has ever required a person to verify their identity (e.g. when you're asked to show a passport to your webcam when you sign up to coinbase). Is this because it's the right thing to do, or because the tech wasn't there?
could even have it plan based. 5 bucks 5 namespaces, 10 bucks 5 namespaces and 5 private packages, 20 bucks unlimited
:shrug:
no charge on package count
Reason I mention this is because what if anyone can upload to the repo, but those who verify their identity get a twitter-style-tick and their contributions are considered more trusted?
yea all that just depends how involved we would want to be
There are third parties who can do all that gross stuff for us these days (taking care of the privacy caveats etc)
yeah speaking of friction, my thinking on how to make the repo super low cost is to make the contents of the packages hosted somewhere else - e.g. when I go to publish my package, I have to provide a URL (which in practice will presumably almost always be a GitHub Release, but doesn't have to be) that end users will download it from.
on publish, the index writes down the hash of the contents of the package, and sends that to the client - so if someone takes over that URL (whatever it was) and tries to change the contents, it'll just fail to install.
we can also back up all the contents of all the packages (which is way cheaper than serving them!), so if any of them ever starts 404ing (like left-pad), we have all the data necessary to restore them to a different URL on short notice. Ideally, we could get community volunteers (or companies) to offer to run mirror networks to be used as fallbacks in case that happens, and the clients would also validate the mirrors against the hashes in the index, to prevent malicious mirrors
cool
the hash being stored in the index (and the contents being backed up, and ideally mirrored) makes it different from the Deno approach
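A minimal sketch of the client side of that design, assuming the index stores a SHA-256 hash per published version and the client knows a list of URLs to try (the publisher's URL first, then mirrors). The function names are made up, and `download` is passed in as a parameter so the sketch doesn't assume any particular HTTP library.

```python
import hashlib
from typing import Callable, Iterable


def sha256_hex(data: bytes) -> str:
    """Hex digest of the package contents, as the index would record it on publish."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(urls: Iterable[str], expected_hash: str,
                   download: Callable[[str], bytes]) -> bytes:
    """Try each URL in order; return the first contents matching the index hash.

    A URL that fails (e.g. a 404 after a left-pad style deletion) or that serves
    different contents (a takeover or a malicious mirror) is skipped.
    """
    for url in urls:
        try:
            data = download(url)
        except OSError:
            continue  # unreachable or deleted: fall back to the next source
        if sha256_hex(data) == expected_hash:
            return data
        # Hash mismatch: never install this, keep looking.
    raise RuntimeError("no source served contents matching the index hash")
```

Because every source is checked against the same hash from the index, a mirror doesn't have to be trusted, only reachable.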
GitHub action to, on release, send the version and hash to the repo?
sure!
yea I was about to say how much do we want to distribute this system
would be cool if volunteers could spin up a mirror with a few commands to help the network
yeah totally!
Sorry wrote that before your messages popped up..
roc volunteer :big_smile:
even provide config files, scripts, terraform, and k8s setups so people can host it in a bunch of ways
(whispers blockchain)
Richard Feldman said:
roc volunteer :big_smile:
woah, that's kinda cool
If the index is just (namespace, name, version, hash) - how big might that be?
Lawrence Job said:
(whispers blockchain)
I've definitely thought about it tbh, I just can't come up with a good economic model for it
Roccoin :D
This is how it all goes downhill
I think there are projects related to this out there
I've never checked but I think I remember a "git"coin kinda thing
there's also radicle
I tried it out, it's kinda neat
100,000 packages with 100 stored versions each, at a cost of (40+40+20+40) bytes per version = 1.4GB + overhead, so Roc package manager nodes would be feasible
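Spelling that estimate out. The mapping of the four byte counts onto (namespace, name, version, hash) is an assumption based on the index shape asked about above, and the package and version counts are the same guesses as in the message.

```python
def index_size_bytes(packages: int, versions_per_package: int, bytes_per_entry: int) -> int:
    """Raw size of an index with one fixed-size entry per published version."""
    return packages * versions_per_package * bytes_per_entry


# Assumed field sizes in bytes: namespace, name, version, hash.
BYTES_PER_ENTRY = 40 + 40 + 20 + 40  # 140 bytes

total = index_size_bytes(100_000, 100, BYTES_PER_ENTRY)
print(f"{total:,} bytes = {total / 1e9:.1f} GB before overhead")
```

That is 10 million entries at 140 bytes each, so the whole index would fit comfortably on a volunteer's machine.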
(and it's plenty small enough for a blockchain but I know I'm going to get booted from the server if I bring it up again)
why I don't hate blockchain
it's contentious but we also aren't shilling coins, just thinking about package management and auth
I'm not saying that it's a good thing, but it would get headlines
oh yea for sure
it doesn't have to be a widely distributed ledger either - it could happily run on 10/100s of volunteer nodes if that's a model we're exploring
true
I wonder what radicle does to work
I didn't have to pay or use coins to use it
As far as I can tell, the code ledger is only distributed across systems who care about it, so you're not using any compute resource other than your own, but I'm not an expert.
Is it possible to build a centralised package manager for v1 with the notion of and pathway to decentralising it into a ledger for 'volunteers'? Would require some clever design at the beginning
very cool
One point about leaving the artefacts on github/providers. Assuming we don't have any issues of trust, etc, would the providers be ok with all of the hammering connections they'd get from the package manager client? Is that something other package managers already do/have solved?
GitHub releases are basically designed to be used for this sort of thing, so seems fine
we also asked someone at GitHub in the early Elm days if they'd be ok with it and they said "yeah we don't care as long as you aren't cloning the entire repo, just getting one commit"
which is what homebrew does, for example
swift pulls from GitHub too I think
so my thinking is basically "how can we use GitHub for hosting in practice without being outright coupled to it"
Seems like
In the case of D) that means the roc package manager can be completely naive and take signed hashes and URLs
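A minimal sketch of what such a "naive" client check could look like, assuming the index stores a SHA-256 hex digest next to each URL. `IndexEntry`, `verify_artifact`, and the example URL are hypothetical, and checking the signature over the entry itself is left out here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One row of a hypothetical package index."""
    namespace: str
    name: str
    version: str
    url: str      # where the artefact is hosted, e.g. a release download
    sha256: str   # hex digest recorded when the version was published


def verify_artifact(entry: IndexEntry, data: bytes) -> bool:
    """True only if the downloaded bytes match the digest in the index."""
    return hashlib.sha256(data).hexdigest() == entry.sha256


# Usage with in-memory bytes standing in for a download.
artifact = b"example package contents"
entry = IndexEntry(
    namespace="example.com",
    name="demo",
    version="1.0.0",
    url="https://example.com/demo-1.0.0.tar.gz",  # placeholder URL
    sha256=hashlib.sha256(artifact).hexdigest(),
)
print(verify_artifact(entry, artifact))          # True
print(verify_artifact(entry, artifact + b"!"))   # False
```

The client never has to trust the host: whatever serves the bytes, they are accepted only if they hash to the recorded value.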
a challenge we've seen with Elm of "GitHub for ID" is that people change their GitHub usernames sometimes :sweat_smile:
it's caused problems in practice
They of course have underlying integer IDs - when you change a username, does it free up the old username? Is it enough for the PM to just use whatever ID was connected to their account to start with (if the final design does indeed use usernames)?
either way, the problem is that there's all these packages out there referring to oldname/foo
by that name in their code
so the package manager would have to at a minimum detect when that happens and set up redirect rules
and apparently you can indeed resurrect old GitHub usernames: https://www.theregister.com/2018/02/10/github_account_name_reuse/
It sounds like the 'correct' solution is to accept that the GitHub username domain has no real uniqueness constraint (only the illusion of one) and to establish a new domain:
a) either a new one that is orthogonal or
b) a new one that bears uncanny resemblance and there's ongoing effort to maintain a link (redirects, etc)
GitHub underlying ID for identity verification <-1:1-> (new, unique username domain) <-1:m-> package
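To make that mapping concrete, here is a rough sketch under a few assumptions: the provider's integer ID is the stable key, a handle is never handed to a different owner once issued, and a rename leaves the old handle behind as a redirect. `Registry` and its methods are made-up names, not a proposed API.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    # provider ID -> handle currently shown for that identity (1:1)
    current_handle: dict[int, str] = field(default_factory=dict)
    # every handle ever issued -> owning provider ID (never reassigned)
    owner_of_handle: dict[str, int] = field(default_factory=dict)
    # provider ID -> package names published by that identity (1:m)
    packages: dict[int, set[str]] = field(default_factory=dict)

    def claim(self, provider_id: int, handle: str) -> None:
        """Register a handle, or rename by claiming an additional one."""
        owner = self.owner_of_handle.get(handle)
        if owner is not None and owner != provider_id:
            raise ValueError(f"handle {handle!r} belongs to another identity")
        self.owner_of_handle[handle] = provider_id
        self.current_handle[provider_id] = handle
        self.packages.setdefault(provider_id, set())

    def publish(self, provider_id: int, package: str) -> None:
        self.packages[provider_id].add(package)

    def resolve(self, qualified: str) -> str:
        """Map 'handle/package' (old or new handle) to its canonical name."""
        handle, package = qualified.split("/", 1)
        owner = self.owner_of_handle[handle]
        if package not in self.packages[owner]:
            raise KeyError(qualified)
        return f"{self.current_handle[owner]}/{package}"


reg = Registry()
reg.claim(1, "oldname")
reg.publish(1, "foo")
reg.claim(1, "newname")            # the user renames themselves
print(reg.resolve("oldname/foo"))  # newname/foo
```

Because old handles stay reserved, code that still refers to `oldname/foo` keeps resolving, and nobody else can resurrect `oldname`.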
yeah I'm surprised how appealing the Maven design seems, all things considered
yea same
my identity is a top-level domain I own, e.g. rtfeldman.com - so maybe I publish a package like rtfeldman.com/roc-cli
I think the hurdle to publish might be too high. Say my company (humio.com for instance) wants to open-source something; then I need to ask the people handling the marketing website to add/serve (and maintain) a public key.
Also, I don't know why, but I feel like you'll be more likely to have fake accounts like humlo.com/xyz with "malicious" packages :sweat_smile:
I dunno, Maven has the second highest number of packages of any package repo (only npm has more) so although that may be a significant hurdle in some cases, I have a hard time concluding that Maven's auth/identity design is keeping it from realizing its potential! :big_smile:
typosquatting is definitely a problem for every package system past a certain size, so we'll need to think about how to mitigate that regardless I think
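One possible mitigation, sketched here purely as an illustration and not as something anyone in the thread proposed: flag a new namespace at publish time if it is within a small edit distance of an existing one. The threshold of 1 and the function names are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def lookalikes(candidate: str, existing: list[str], max_distance: int = 1) -> list[str]:
    """Existing names that a new name could easily be mistaken for."""
    return [
        name for name in existing
        if name != candidate and edit_distance(candidate, name) <= max_distance
    ]


print(lookalikes("humlo.com", ["humio.com", "rtfeldman.com"]))  # ['humio.com']
```

A hit would not have to block publishing; it could just trigger a manual review or a warning on install.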
I'm not sure whether this might be a different topic: I think having a strong default package system is really important. In addition, having a way to easily import packages from arbitrary sources (file system or URL) would allow for great flexibility.
So having good defaults that work most of the time would be enough, and the package system could rely on those alternatives for the cases where the defaults don't work as well.
I really like the idea to have third parties be able to sign releases. Especially for platforms. Like you said, the platforms are kind of a trust thing. Having a big/trusted company (if Roc is able to get them to use Roc) say "yes, we analysed that package and it seems safe to use" would be nice. Or you could pay for an audit and have that reflected directly in the package.
If you don't allow deleting a package but don't host it yourself, what is the purpose? I mean, if the package gets deleted, you only have a dead link.
On the other side, if a creator signs a package you could have mirrors for packages. And if someone decides to delete the originally hosted version, a mirror could seamlessly take over. The only problem is managing the key infrastructure. But if you want to be able to sign packages, you need something similar anyway?
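A small sketch of the mirror fallback, assuming the index records a SHA-256 digest per version so an untrusted mirror can be checked the same way as the original host. The `Fetcher` callables stand in for real downloads; all names here are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A fetcher returns the artefact bytes, or None if that source no longer has it.
Fetcher = Callable[[], Optional[bytes]]


def fetch_verified(sources: Iterable[Fetcher], expected_sha256: str) -> bytes:
    """Try the original host first, then mirrors; accept only matching bytes."""
    for fetch in sources:
        data = fetch()
        if data is not None and hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    raise LookupError("no source served an artefact matching the recorded hash")


package = b"package bytes"
digest = hashlib.sha256(package).hexdigest()

original_deleted = lambda: None          # the author removed the release
tampered_mirror = lambda: b"evil bytes"  # a mirror serving something else
honest_mirror = lambda: package

data = fetch_verified([original_deleted, tampered_mirror, honest_mirror], digest)
print(data == package)  # True
```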
Not sure I am a fan of the domain owner solution. It adds friction and I am not sure it would stop an attacker: attacking the ecosystem would already cost much more than the few euros a domain costs. Maven is an established ecosystem; Roc has to get there first. And I am not sure that it would add anything that a key signing solution wouldn't add. Isn't there also a problem with UTF-8 domains? In this case, you'd also force a package creator to take care of that (or check when adding to the package repository).
Another aspect to the naming: If you only have 'simple' names, you can't distinguish between official packages and third party ones based on the name.
And a +1 for e.g. direct git urls.
Christian Dereck said:
I really like the idea to have third parties be able to sign releases.
Thanks for saying - I thought it went down like a lead balloon, but I think it would be transformative for package managers. I like the idea of an audit/accreditation system growing naturally, too.
If you don't allow deleting a package but don't host it yourself, what is the purpose? I mean, if the package gets deleted, you only have a dead link.
Ahhggg good point. Does this rule out getting away without hosting the files? I'm starting to think it might. The mirror point is interesting - could it be used as a fallback if the original package disappears? Seems like the package manager's going to need a sponsor with deep pockets if the language takes off and preventing deletion is a required feature.
Dart has a concept of verified publishers (where you verify that you own a domain) that is orthogonal to the naming of the package. To publish a package, all you have to do is sign in with a Google account, but you can at any time switch to a verified publisher by linking your domain name. (There is the theoretical problem of name squatting, but since you publish with your Gmail / domain name you'd probably be looked down on in the community for doing so, and the Dart team handles taking down malicious or bad packages when they are reported, which is pretty infrequent.) When using a verified publisher you can authorize multiple people to upload the package (and the package page displays the email addresses of the uploaders).
Being verified doesn't change the package search or give preference, but it does add a badge after the name of the package. Instead, packages can get points for following formatting / style conventions, providing documentation, supporting multiple platforms, passing static analysis, having up-to-date dependencies, and supporting null safety. These contribute to a score that factors into the search.
Additionally it has Likes / Popularity measures that factor in somewhat, but not as much, and they end up remapped from a range of [0-1] to [0.5-1] to account for new packages. I've found that the popularity measures / likes are useful to differentiate between similarly named packages that are trying to gain from the popularity of the original package. A lot of the weight of the search is based on a fuzzy match in the readme / package name.
They do have the advantage that hosting the packages and the package documentation is supported by Google.
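To illustrate the remapping mentioned above: a sketch that assumes the map from [0-1] to [0.5-1] is linear, which is my guess and not something taken from the pub.dev source. `remap_popularity` is a made-up name.

```python
def remap_popularity(raw: float) -> float:
    """Linearly squeeze a [0, 1] score into [0.5, 1] (assumed shape)."""
    if not 0.0 <= raw <= 1.0:
        raise ValueError("raw score must be within [0, 1]")
    return 0.5 + 0.5 * raw


# A brand-new package with no popularity keeps half of the maximum weight.
print(remap_popularity(0.0))  # 0.5
print(remap_popularity(1.0))  # 1.0
```

The effect is that a package with zero likes is dampened but never zeroed out, so new packages can still surface in search.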
It also has a mechanism to make the package manager use a different package server, and the package server is open-sourced so anyone can host their own package server which could either proxy to the normal server, or whatever you want.
The search ranking: https://github.com/dart-lang/pub-dev/blob/master/doc/search.md
Quality metrics: https://pub.dev/help/scoring
And publisher / publishing: https://dart.dev/tools/pub/publishing
Thought it might be useful to have another perspective, since many of you are probably not as familiar with dart.
cool idea
Last updated: Jun 16 2026 at 16:19 UTC