RFC: marking private and public APIs #48819
-
I don't think preventing people from accessing a given function entirely via some form of
I also don't think such an
In regards to smaller package images - in an AOT setting, where a static binary without compilation is the goal, you don't need an additional layer of
I don't see how
I really like having an
As a point of reference (but please don't feel the need to post), there's also this lengthy Discourse thread about this very topic, with lots of good arguments for & against various versions of this (including my quite strong stance about why I personally think a
-
@Seelengrab I probably should have put more emphasis on the improvement in code loading times. As for the API definitions part, I think it should be in the scope of #42117. I think the comments on
Yes, that's exactly what I'm trying to propose here actually, a more "static" marker, the proposed
I don't have an example right now, since I thought this was obvious. But I could be wrong on this. But let me explain two things here:
    module A
    # package A
    using B: foo
    end # module A

But

    A.eval(quote
        using B: goo
        goo
    end)

and by calling the loader at runtime throw
Basically, here we are treating
This is basically why I'm saying that if we want this in 1.x, we will need something like a
I don't think you have access to all the internals even with the current module loading system? If the only module you have access to above is
And in my proposal, if the loading statement is written as
@Seelengrab, maybe you also didn't see the example I provided for the MLStyle case above, which is an extreme case because it doesn't have any dependencies. If we make MLStyle inaccessible from outside, we can safely skip loading MLStyle, which improves loading time (see the ExproniconLite package, which I created in a hacky way). This is a bit hard to demonstrate using a macro, so I'm not able to provide something more complicated, but I hope this demonstrates the idea.
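A minimal, self-contained sketch of the runtime `eval` scenario from the `A.eval` example above. The modules `B` and `A` here are local placeholders for real packages (hence the relative `..B` path), and the function bodies are made up for illustration. It only shows that code evaluated at runtime can request a name that the static `using` statement never mentioned, which is why the loader cannot assume the rest of `B` is unreachable.

```julia
# Placeholder modules standing in for packages B and A (assumption: defined
# side by side in the same parent module, so `..B` resolves to B).
module B
foo() = "foo"
goo() = "goo"
end

module A
# package A: statically, only `foo` is requested from B
using ..B: foo
end

# Code evaluated at runtime can still ask for another name from B.
A.eval(:(using ..B: goo))

# The newly imported name is now callable from inside A.
goo_result = A.eval(:(goo()))
```

If `using B: foo` were treated as a hard boundary, the `A.eval` call would have to fail or trigger loading at runtime, which is the behavior change being discussed.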
-
I guess extensions are not exactly the same as this: extensions are lazily loaded, whereas this is about reducing downstream package loading time by reducing unnecessary cache. The effect is similar (much faster loading time), but both should be used in practice to reduce loading time further. On the other hand, I doubt that converting, say, every ODE solver into an extension makes sense or is practical... marking things inaccessible in downstream packages could be easier to adopt.
-
That is not true:

    julia> module B
           foo() = "hi"
           bar() = "bar"
           end
    Main.B

    julia> module A
           # package A
           using ..B: foo
           end # module A
    Main.A

    julia> (A.foo |> parentmodule).bar()
    "bar"

So unless we break the notion of accessibility of unexported names, you can currently get everything in a dependency via reflection on anything that module gave us. If you want to get this into a 1.x release, breaking here does not look like an option because there's no safe state to fall back to in general here - we don't actually track the API surface of anything in a programmatic way right now, and "unexported" is not an accurate proxy variable here. To me that means the only option left is to actually make it queryable what is API, by placing explicit
Yes, but that is not sufficient. What is accessible vs. what is API are orthogonal concepts, and the current (unspoken & unqueryable) convention is that unexported names are (usually) not API, and documented names are (usually) API. It's this fuzziness around what exactly constitutes the API of a package that is the problem, not whether you can access the given name at all. Regardless of solving that, using any sort of API marker as a boundary for tree shaking is bound to lead to problems because that's not the job of an API marker - their job is to inform a developer/user about what is considered safe/stable to access across versions, and accessing anything else does not have any guarantees associated with it.
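As a small sketch of the "accessible vs. API" distinction: the only thing that can be queried programmatically in this example is whether a name is exported, and that is independent of whether the name can be reached. The module `Lib` and its functions are hypothetical; `names` and `Base.isexported` are existing Base functions.

```julia
# Hypothetical library with one exported and one unexported function.
module Lib
export api_function
api_function() = "part of the exported surface"
internal_helper() = "not exported, still reachable"
end

# `names` lists the exported names (newer Julia versions also include
# names declared public), plus the module's own name.
exported = names(Lib)

# Export status can be queried per name...
is_api_exported = Base.isexported(Lib, :api_function)
is_helper_exported = Base.isexported(Lib, :internal_helper)

# ...but it says nothing about accessibility: the unexported name
# is still callable through the module.
helper_result = Lib.internal_helper()
```

Whether `internal_helper` is considered stable API is not recorded anywhere in this sketch, which is the fuzziness described above.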
To be very clear here - with "extension" I was not referring to the weak dependencies/package extensions we now have on master, but referring to the practice of extending the interface of a library in Java with a new type/object. Unless that library has all of its boundary marked as
-
I didn't see the
Yeah, or that, I'm open to anything that marks an API, a publicly accessible name, or whatever we name it. But I'm just saying this can be helpful in reducing loading time if we make things inaccessible.
OK, you've got me confused now. Are you proposing we treat the
-
I'm against hard barriers to access. Hacking "private" methods is honestly a very useful feature of Julia. If you want there to be a little extra friction for access to "private" methods, that would be a nice addition: make the consuming code self-document that it's using a "private" method. There would have to be a very compelling reason to strip it entirely, IMO. Which leaves...
Which I don't know enough about to comment on its merits.
-
I kinda regret that I put this as my first point now... So basically, if we want to delete these methods later, we need a way to prevent downstream users from accessing them; otherwise, we cannot delete them. I think this makes a lot of sense for an AOT language, but it's tricky for a JIT language, as people's expectations are different. I kinda understand why you are against hard barriers to access as a developer. If we can trade being able to hack private methods for faster loading time, I'd choose the latter, but let me change my post a bit to be more clear on this...
-
To be clear, I think the claim that a public/private or accessible/inaccessible distinction can improve load times is dubious - at best, you can delete code that isn't API and isn't used by the package internally, at which point, why is that code in the package in the first place? The load times are not a result of functions that no one calls after all, but of all those that are called (one way or another).
-
From my comment on Slack: a symbol followed by a question mark without a space
A motivation for this would be if we want to avoid the association with OOP-style lookup, making people think they could actually write a function call with that form.
-
Closing now that we have
-
This is based on #42117, but it is a somewhat different proposal that might be breaking for Julia 1.x and rests on different considerations, so I decided to open a separate issue to make it easier to read.
In short, I'd like a similar macro/keyword that marks an object (<module>.<name>, or <name> after it is explicitly imported by import/using) as private, which means one cannot access it at all from outside.
The syntax may vary due to compatibility concerns, e.g.
Keep things non-breaking
Because we currently do not hide things by default, we will need a marker for things we want to make private.
Breaking but simpler
If we choose something more breaking, it could be a public/export keyword combined with #39235, where export/public could just mean "publicly" accessible APIs, and other things are not accessible from outside the module directly.
On the other hand, using XXX and import XXX need to be private by default unless marked with export/public, so we can prevent someone from using a long module path to access some deep dependencies of a package (see point 2 in Why).
Additionally
I'd like a macro that marks the stability status of certain APIs. This is something I find quite nice in the Rust community, where they have things like
It marks the stability of certain things programmatically, at the same place where they are defined, so that a linter can warn users based on their current toolchain version.
In Julia, we only have a poor docstring saying "use at your own risk", which is something I think this could improve. Having this shouldn't be breaking: it could be a single macro that marks struct fields and function arguments with their availability and stability. This could make experimental features easier to provide and to play with downstream.
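A rough sketch of what such a marker could look like, under heavy assumptions: the module `StabilitySketch`, the `@stability` macro, the `REGISTRY` table, and `stability_of` are all hypothetical names invented for this illustration and are not an existing Julia or package API. The sketch only handles plain function definitions (not struct fields or individual arguments) and records a status symbol next to the definition so that a tool could query it later.

```julia
module StabilitySketch

# Hypothetical registry: (defining module, function name) => status symbol.
const REGISTRY = Dict{Tuple{Module,Symbol},Symbol}()

# Extract the function name from `f(args...) = body` or `function f(args...) ... end`.
function defined_name(def::Expr)
    def.head in (:function, :(=)) || error("expected a function definition")
    sig = def.args[1]
    # Peel off `where` clauses and return-type annotations.
    while sig isa Expr && sig.head in (:where, :(::))
        sig = sig.args[1]
    end
    (sig isa Expr && sig.head == :call) || error("expected a call signature")
    name = sig.args[1]
    name isa Symbol || error("only plain function names are supported in this sketch")
    return name
end

# Hypothetical marker: emits the definition unchanged and records its status.
macro stability(status, def)
    status isa Symbol || error("status must be a plain symbol such as `stable`")
    name = defined_name(def)
    mod = __module__
    return quote
        $(esc(def))
        REGISTRY[($mod, $(QuoteNode(name)))] = $(QuoteNode(status))
        nothing
    end
end

# Query helper a linter or doc generator could call.
stability_of(m::Module, name::Symbol) = get(REGISTRY, (m, name), :unknown)

end # module StabilitySketch

module Demo
using ..StabilitySketch: @stability
@stability experimental fancy_solver(x) = 2x
@stability stable plain_solver(x) = x
end

fancy_status = StabilitySketch.stability_of(Demo, :fancy_solver)
plain_status = StabilitySketch.stability_of(Demo, :plain_solver)
```

A runtime dictionary is used here only to keep the sketch short; a real design would more likely attach the status to the binding or its documentation metadata so that it is available without running the package.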
Why?
#42117 overlaps with this proposal, so the reasons @DilumAluthge listed also apply here. I'd like to provide a few other motivations though:
A.B.C.<a private function>. I think quite a few AOT languages have a similar mechanism to mark things private so they can be tree-shaken away. I'm not an expert on package caches or system images, but I think this might be one of the low-hanging fruits that can be picked by changing the semantics a bit, so please correct me if this is not something that can be improved in Julia's case.
A demonstration of this would be automatically resolving the issue that https://expronicon.rogerluo.dev/intro/bootstrap and https://github.com/thautwarm/DevOnly.jl are trying to solve:
MLStyle is an extreme example of this. What @match generates only depends on Base, and the only reason downstream packages still load MLStyle is that users are allowed to access MLStyle's @match via AAA.BBB.CCC.@match if CCC contains using MLStyle: @match. One can get a rough estimate of the loading time improvement in this extreme case:
Here you can see that even though Expronicon depends on MLStyle, by removing MLStyle from loading entirely we get loading that is twice as fast as when depending on MLStyle.
Base, we currently have a very vague way of distinguishing them by whether the function has a docstring or not. IMO even functions made only for developers deserve a docstring in the dev docs. If we have a marker to distinguish such APIs, then the reference pages for the manual and the dev docs can be generated automatically for the corresponding functions. And for APIs like Base.print_matrix etc., it would be clearer to people whether they should use them and whether they might break in future versions.
But I'd like to mention one real-world example, which is the usage of DiffEq. Most of the time one only uses one ODE solver from that giant package, so in principle a downstream package should not be loading the whole thing, only that ODE solver's code. But because you are currently allowed to access other solvers through something like MyPackage.DiffEq.OrdinaryDiffEq.Vern8, we have to load the whole thing, which is super slow.
Ideally, the compiler should cache the corresponding solver code into the downstream package image and only load that piece on using MyPackage without an explicit using DiffEq. But I think this is not allowed, because users can technically do MyPackage.DiffEq to access anything inside DiffEq.
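A toy sketch of the module-path access described above, using placeholder modules rather than the real DiffEq packages: `BigDep` stands in for the large dependency and `MyPackage` for the downstream package. It only demonstrates that a plain `using` makes the whole dependency reachable through the downstream module, even for functions the downstream package never calls.

```julia
# Placeholder for a large dependency with several solvers.
module BigDep
solver_used() = "needed"
solver_unused() = "never called by MyPackage"
end

# Placeholder downstream package that only needs one solver.
module MyPackage
using ..BigDep          # also binds the name `BigDep` inside MyPackage
run_it() = BigDep.solver_used()
end

# A user of MyPackage can walk the module path to the unused solver,
# so nothing in BigDep can be assumed unreachable.
reached = MyPackage.BigDep.solver_unused()
```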