Code Loading

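Definitions

Julia has two mechanisms for loading code:

1. Code inclusion: e.g. include("source.jl"). Inclusion allows you to split a single program across multiple source files. The expression include("source.jl") causes the contents of the file source.jl to be evaluated in the global scope of the module where the include call occurs. If include("source.jl") is called multiple times, source.jl is evaluated multiple times. The included path, source.jl, is interpreted relative to the file where the include call occurs. This makes it simple to relocate a subtree of source files. In the REPL, included paths are interpreted relative to the current working directory, pwd().
2. Package loading: e.g. import X or using X. The import mechanism allows you to load a package (an independent, reusable collection of Julia code, wrapped in a module) and makes the resulting module available by the name X inside the importing module. If the same package X is imported multiple times in the same Julia session, it is only loaded the first time; on subsequent imports, the importing module gets a reference to the same module. Note, however, that import X can load different packages in different contexts: X can refer to one package named X in the main project but to different packages that are also named X in each dependency. More on this below.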
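
The repeated evaluation performed by include can be seen in a small sketch. The temporary file and the counter variable are illustrative names chosen for this example, not part of any API.

```julia
# Create a throwaway source file; the path and contents are made up for illustration.
dir = mktempdir()
source = joinpath(dir, "source.jl")
write(source, "counter[] += 1\n")

# `include` evaluates the file in the global scope of the calling module,
# so the file can see the global `counter` defined here.
counter = Ref(0)
include(source)
include(source)   # the file is evaluated a second time

println(counter[])  # prints 2
```

An import of an already loaded package behaves differently: it only creates another reference to the module that was loaded the first time.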

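Federation of packages

Julia supports federated management of packages, which means that multiple independent parties can maintain both public and private packages and registries of packages, and that projects can depend on a mix of public and private packages from different registries. Packages from various registries are installed and managed using a common set of tools and workflows. The Pkg package manager that ships with Julia lets you install and manage your projects' dependencies. It assists in creating and manipulating project files (which describe what other projects your project depends on) and manifest files (which snapshot the exact versions of your project's complete dependency graph).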

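Environments

An environment determines what import X and using X mean in various code contexts and what files these statements cause to be loaded. Julia understands two kinds of environments:

1. A project environment is a directory with a project file and an optional manifest file, and forms an explicit environment. The project file determines the names and identities of the direct dependencies of the project. The manifest file, if present, gives a complete dependency graph, including all direct and indirect dependencies, the exact version of each dependency, and sufficient information to locate and load the correct version.
2. A package directory is a directory containing the source trees of a set of packages as subdirectories, and forms an implicit environment. If X is a subdirectory of a package directory and X/src/X.jl exists, then the package X is available in the package directory environment and X/src/X.jl is the source file by which it is loaded.

These can be intermixed to create a stacked environment: an ordered set of project environments and package directories, overlaid to make a single composite environment. Each kind of environment serves a different purpose:

• Project environments provide reproducibility. By checking a project environment into version control (e.g. a git repository) along with the rest of the project's source code, you can reproduce the exact state of the project and all of its dependencies. The manifest file, in particular, captures the exact version of every dependency, identified by a cryptographic hash of its source tree, which makes it possible for Pkg to retrieve the correct versions and be sure that you are running the exact code that was recorded for all dependencies.
• Package directories provide convenience when a fully tracked project environment is unnecessary. They are useful when you want to put a set of packages somewhere and be able to use them directly, without needing to create a project environment for them.
• Stacked environments allow for adding tools to the primary environment. You can push an environment of development tools onto the end of the stack to make them available from the REPL and scripts, but not from inside packages.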

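At a high level, each environment conceptually defines three maps: roots, graph and paths.

• roots: name::Symbol ⟶ uuid::UUID

An environment's roots map assigns package names to UUIDs for all the top-level dependencies that the environment makes available to the main project (i.e. the ones that can be loaded in Main). When Julia encounters import X in the main project, it looks up the identity of X as roots[:X].

• graph: context::UUID ⟶ name::Symbol ⟶ uuid::UUID

An environment's graph is a multilevel map which assigns, for each context UUID, a map from names to UUIDs, similar to the roots map but specific to that context. When Julia sees import X in the code of the package whose UUID is context, it looks up the identity of X as graph[context][:X]. In particular, this means that import X can refer to different packages depending on context.

• paths: uuid::UUID × name::Symbol ⟶ path::String

The paths map assigns to each package UUID-name pair the location of that package's entry-point source file. After the identity of X in import X has been resolved to a UUID via roots or graph (depending on whether it is loaded from the main project or from a dependency), Julia determines what file to load to acquire X by looking up paths[uuid,:X] in the environment. Including this file should define a module named X. Once this package is loaded, any subsequent import resolving to the same uuid only creates a new binding to the already loaded package module.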
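
How the three maps cooperate can be sketched with ordinary dictionaries. Everything in this example is hypothetical: the UUIDs are made up, the paths do not exist, and locate is an illustrative helper, not a function that Julia provides.

```julia
using UUIDs

# Made-up UUIDs for illustration only.
app_dep = UUID(UInt128(1))   # the X visible from the main project
lib     = UUID(UInt128(2))   # a dependency that also imports something named X
lib_dep = UUID(UInt128(3))   # the X visible from inside `lib`

roots = Dict(:X => app_dep, :Lib => lib)
graph = Dict(
    lib     => Dict(:X => lib_dep),
    app_dep => Dict{Symbol,UUID}(),
    lib_dep => Dict{Symbol,UUID}(),
)
paths = Dict(
    (app_dep, :X) => "/hypothetical/app/deps/X/src/X.jl",
    (lib, :Lib)   => "/hypothetical/depot/packages/Lib/src/Lib.jl",
    (lib_dep, :X) => "/hypothetical/depot/packages/X/src/X.jl",
)

# Hypothetical helper: which file would `import name` load?
# `context === nothing` stands for code in the main project.
function locate(roots, graph, paths, context, name::Symbol)
    uuid = context === nothing ? roots[name] : graph[context][name]
    return paths[(uuid, name)]
end

println(locate(roots, graph, paths, nothing, :X))  # prints /hypothetical/app/deps/X/src/X.jl
println(locate(roots, graph, paths, lib, :X))      # prints /hypothetical/depot/packages/X/src/X.jl
```

The same name X resolves to two different files because the lookup starts from roots in the main project and from graph[context] inside a dependency.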

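Project environments

The roots map of the environment is determined by the contents of the project file, specifically its top-level name and uuid entries and its [deps] section (all optional). Consider the following example project file for the hypothetical application App, as described earlier: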

name = "App"
uuid = "8f986787-14fe-4607-ba5d-fbff2944afa9"

[deps]
Pub  = "c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"

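This project file implies the following roots map, represented as a Julia dictionary:

roots = Dict(
    :App => UUID("8f986787-14fe-4607-ba5d-fbff2944afa9"),
    :Pub => UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"),
)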
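
As a rough sketch of how such a roots map could be derived, the following reads a project file with the TOML standard library. The helper name roots_from_project is an assumption made for this example; it is not how Julia's loader is implemented internally.

```julia
using TOML, UUIDs

project = TOML.parse("""
name = "App"
uuid = "8f986787-14fe-4607-ba5d-fbff2944afa9"

[deps]
Pub = "c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"
""")

# Hypothetical helper: collect the project's own name/uuid pair (if present)
# and every entry of its [deps] section (if present) into one dictionary.
function roots_from_project(project)
    roots = Dict{Symbol,UUID}()
    if haskey(project, "name") && haskey(project, "uuid")
        roots[Symbol(project["name"])] = UUID(project["uuid"])
    end
    for (name, uuid) in get(project, "deps", Dict{String,Any}())
        roots[Symbol(name)] = UUID(uuid)
    end
    return roots
end

roots = roots_from_project(project)
println(sort(collect(keys(roots))))  # prints [:App, :Pub]
```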

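The dependency graph of a project environment is determined by the contents of the manifest file, if present. Consider the following example manifest file for App:

[[Priv]] # the private one
deps = ["Pub", "Zebra"]
path = "deps/Priv"

[[Priv]] # the public one
uuid = "2d15fe94-a1f7-436c-a4d8-07a9a496e01c"
git-tree-sha1 = "1bf63d3be994fe83456a03b874b409cfd59a6373"
version = "0.1.5"

[[Pub]]
uuid = "c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"
git-tree-sha1 = "9ebd50e2b0dd1e110e842df3b433cb5869b0dd38"
version = "2.1.4"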

[Pub.deps]
Priv = "2d15fe94-a1f7-436c-a4d8-07a9a496e01c"
Zebra = "f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"

[[Zebra]]
uuid = "f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"
git-tree-sha1 = "e808e36a5d7173974b90a15a353b564f3494092f"
version = "3.4.2"

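This manifest file describes a possible complete dependency graph for the App project:

• There are two different packages named Priv that the application uses: a private package, which is a root dependency, and a public one, which is an indirect dependency through Pub. They are differentiated by their distinct UUIDs, and they have different dependencies:
   • The private Priv depends on the Pub and Zebra packages.
   • The public Priv has no dependencies.
• The application also depends on the Pub package, which in turn depends on the public Priv and on the same Zebra package that the private Priv package depends on.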

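This dependency graph, represented as a dictionary, looks like this:

graph = Dict(
    # Priv, the private one (its UUID is not shown here):
    UUID("…") => Dict(
        :Pub   => UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"),
        :Zebra => UUID("f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"),
    ),
    # Priv, the public one:
    UUID("2d15fe94-a1f7-436c-a4d8-07a9a496e01c") => Dict(),
    # Pub:
    UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1") => Dict(
        :Priv  => UUID("2d15fe94-a1f7-436c-a4d8-07a9a496e01c"),
        :Zebra => UUID("f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"),
    ),
    # Zebra:
    UUID("f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62") => Dict(),
)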

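Given this dependency graph, when Julia sees import Priv in the Pub package, which has UUID c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1, it looks up:

graph[UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1")][:Priv]

and gets 2d15fe94-a1f7-436c-a4d8-07a9a496e01c, which indicates that in the context of the Pub package, import Priv refers to the public Priv package rather than the private one that the app depends on directly.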
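
A graph of this shape could be assembled from manifest stanzas roughly as follows. This is a sketch under stated assumptions: it handles only stanzas that carry a uuid entry and a table-style deps section, so the private Priv stanza (which lists its deps by name) is left out, and it does not reflect how Julia's loader actually reads manifests.

```julia
using TOML, UUIDs

# Stanzas taken from the example manifest; the private Priv stanza is omitted.
manifest = TOML.parse("""
[[Priv]]
uuid = "2d15fe94-a1f7-436c-a4d8-07a9a496e01c"
git-tree-sha1 = "1bf63d3be994fe83456a03b874b409cfd59a6373"
version = "0.1.5"

[[Pub]]
uuid = "c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"
git-tree-sha1 = "9ebd50e2b0dd1e110e842df3b433cb5869b0dd38"
version = "2.1.4"

  [Pub.deps]
  Priv = "2d15fe94-a1f7-436c-a4d8-07a9a496e01c"
  Zebra = "f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"

[[Zebra]]
uuid = "f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"
git-tree-sha1 = "e808e36a5d7173974b90a15a353b564f3494092f"
version = "3.4.2"
""")

# One graph entry per stanza: stanza UUID => (dependency name => dependency UUID).
graph = Dict{UUID,Dict{Symbol,UUID}}()
for stanzas in values(manifest), stanza in stanzas
    deps = get(stanza, "deps", Dict{String,Any}())
    graph[UUID(stanza["uuid"])] =
        Dict{Symbol,UUID}(Symbol(k) => UUID(v) for (k, v) in deps)
end

pub = UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1")
println(graph[pub][:Priv] == UUID("2d15fe94-a1f7-436c-a4d8-07a9a496e01c"))  # prints true
```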

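The paths map of a project environment is extracted from the manifest file. The path of a package uuid named X is determined by these rules (in order):

1. If the project file in the directory matches uuid and name X, then either:
   • It has a top-level path entry, in which case uuid is mapped to that path, interpreted relative to the directory containing the project file.
   • Otherwise, uuid is mapped to src/X.jl relative to the directory containing the project file.
2. If the above is not the case, and the project file has a corresponding manifest file that contains a stanza matching uuid, then:
   • If the stanza has a path entry, use that path (relative to the directory containing the manifest file).
   • If the stanza has a git-tree-sha1 entry, compute a deterministic hash function of uuid and git-tree-sha1 (call it slug) and look for a directory named packages/X/$slug in each directory in the Julia DEPOT_PATH global array. Use the first such directory that exists.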

If, on the other hand, Julia were loading the other Priv package (the one with UUID 2d15fe94-a1f7-436c-a4d8-07a9a496e01c), it would find its stanza in the manifest and see that it does not have a path entry, but that it does have a git-tree-sha1 entry. It would then compute the slug for this UUID/SHA-1 pair, which is HDkrT (the exact details of this computation aren't important, but it is consistent and deterministic). This means that the path to this Priv package will be packages/Priv/HDkrT/src/Priv.jl in one of the package depots. Suppose the contents of DEPOT_PATH are ["/home/me/.julia", "/usr/local/julia"]; then Julia will look at the following paths to see if they exist:

1. /home/me/.julia/packages/Priv/HDkrT
2. /usr/local/julia/packages/Priv/HDkrT

Julia uses the first of these that exists to try to load the public Priv package from the file packages/Priv/HDkrT/src/Priv.jl in the depot where it was found.
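
The depot search just described can be sketched as a simple loop. The helper find_installed is hypothetical, and the slug is passed in as a plain string because its computation is treated here as an opaque detail; temporary directories stand in for the user and system depots.

```julia
# Hypothetical helper: return the first `packages/<name>/<slug>` directory found
# in the given depots, or `nothing`. The slug is taken as given, not computed.
function find_installed(depots, name, slug)
    for depot in depots
        dir = joinpath(depot, "packages", name, slug)
        isdir(dir) && return dir
    end
    return nothing
end

# Two throwaway depots standing in for a user depot and a system depot.
user_depot, system_depot = mktempdir(), mktempdir()
mkpath(joinpath(system_depot, "packages", "Priv", "HDkrT", "src"))

found = find_installed([user_depot, system_depot], "Priv", "HDkrT")
println(found == joinpath(system_depot, "packages", "Priv", "HDkrT"))  # prints true
```

Because the depots are searched in order, a copy installed in an earlier depot would be preferred over one in a later depot.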

Here is a representation of a possible paths map for our example App project environment, as provided in the Manifest given above for the dependency graph, after searching the local file system:

paths = Dict(
    # Priv – the private one (its UUID is not shown here):
    (UUID("…"), :Priv) =>
        # relative entry-point inside App repo:
        "/home/me/projects/App/deps/Priv/src/Priv.jl",
    # Priv – the public one:
    (UUID("2d15fe94-a1f7-436c-a4d8-07a9a496e01c"), :Priv) =>
        # package installed in the system depot:
        "/usr/local/julia/packages/Priv/HDkrT/src/Priv.jl",
    # Pub:
    (UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1"), :Pub) =>
        # package installed in the user depot:
        "/home/me/.julia/packages/Pub/oKpw/src/Pub.jl",
    # Zebra:
    (UUID("f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"), :Zebra) =>
        # package installed in the system depot:
        "/usr/local/julia/packages/Zebra/me9k/src/Zebra.jl",
)

This example map includes three different kinds of package locations (the first and third are part of the default load path):

1. The private Priv package is "vendored" inside the App repository.
2. The public Priv and Zebra packages are in the system depot, where packages installed and managed by the system administrator live. These are available to all users on the system.
3. The Pub package is in the user depot, where packages installed by the user live. These are only available to the user who installed them.

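Package directories

A package X exists in a package directory if the directory contains one of the following entry-point files: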

• X.jl
• X/src/X.jl
• X.jl/src/X.jl
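
A minimal sketch of this check is shown below. The helper entry_point is hypothetical, and trying the candidates in the order listed is an assumption of the sketch.

```julia
# Hypothetical helper: return the entry-point file of package `name` inside
# the package directory `pkgdir`, or `nothing` if none of the layouts exists.
function entry_point(pkgdir, name)
    file = name * ".jl"
    candidates = (
        joinpath(pkgdir, file),                # X.jl
        joinpath(pkgdir, name, "src", file),   # X/src/X.jl
        joinpath(pkgdir, file, "src", file),   # X.jl/src/X.jl
    )
    for candidate in candidates
        isfile(candidate) && return candidate
    end
    return nothing
end

# A throwaway package directory with one single-file and one nested package.
pkgdir = mktempdir()
write(joinpath(pkgdir, "Solo.jl"), "module Solo end\n")
mkpath(joinpath(pkgdir, "Nested", "src"))
write(joinpath(pkgdir, "Nested", "src", "Nested.jl"), "module Nested end\n")

println(entry_point(pkgdir, "Solo") == joinpath(pkgdir, "Solo.jl"))  # prints true
println(entry_point(pkgdir, "Missing") === nothing)                  # prints true
```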

Which dependencies a package in a package directory can import depends on whether the package contains a project file:

• If it has a project file, it can only import those packages which are identified in the [deps] section of the project file.
• If it does not have a project file, it can import any top-level package—i.e. the same packages that can be loaded in Main or the REPL.

The roots map is determined by examining the contents of the package directory to generate a list of all packages that exist. Additionally, a UUID will be assigned to each entry as follows: For a given package found inside the folder X...

1. If X/Project.toml exists and has a uuid entry, then uuid is that value.
2. If X/Project.toml exists but does not have a top-level uuid entry, uuid is a dummy UUID generated by hashing the canonical (real) path to X/Project.toml.
3. Otherwise (if Project.toml does not exist), then uuid is the all-zero nil UUID.
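
The three rules can be sketched as one small function. The name directory_uuid is hypothetical, and uuid5 from the UUIDs standard library is used here only as a stand-in for the hashing step; the hash that Julia applies internally may well differ.

```julia
using TOML, UUIDs

# Hypothetical helper implementing the three rules for a package folder.
function directory_uuid(pkgdir)
    project_file = joinpath(pkgdir, "Project.toml")
    isfile(project_file) || return UUID(UInt128(0))           # rule 3: nil UUID
    project = TOML.parsefile(project_file)
    haskey(project, "uuid") && return UUID(project["uuid"])   # rule 1: explicit UUID
    # Rule 2: dummy UUID derived from the real path (stand-in hash, see above).
    return uuid5(UUID(UInt128(0)), realpath(project_file))
end

# A throwaway package directory with three package folders.
root = mktempdir()
for name in ("Aardvark", "Bobcat", "Dingo")
    mkpath(joinpath(root, name))
end
write(joinpath(root, "Bobcat", "Project.toml"), "[deps]\n")
write(joinpath(root, "Dingo", "Project.toml"),
      "uuid = \"7a7925be-828c-4418-bbeb-bac8dfc843bc\"\n")

println(directory_uuid(joinpath(root, "Aardvark")))  # prints 00000000-0000-0000-0000-000000000000
```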

The dependency graph of a package directory is determined by the presence and contents of project files in the subdirectory of each package. The rules are:

• If a package subdirectory has no project file, then it is omitted from graph and import statements in its code are treated as top-level, the same as the main project and REPL.
• If a package subdirectory has a project file, then the graph entry for its UUID is the [deps] map of the project file, which is considered to be empty if the section is absent.

As an example, suppose a package directory has the following structure and content:

Aardvark/
    src/Aardvark.jl:
        import Bobcat
        import Cobra

Bobcat/
    Project.toml:
        [deps]
        Cobra = "4725e24d-f727-424b-bca0-c4307a3456fa"
        Dingo = "7a7925be-828c-4418-bbeb-bac8dfc843bc"

    src/Bobcat.jl:
        import Cobra
        import Dingo

Cobra/
    Project.toml:
        uuid = "4725e24d-f727-424b-bca0-c4307a3456fa"
        [deps]
        Dingo = "7a7925be-828c-4418-bbeb-bac8dfc843bc"

    src/Cobra.jl:
        import Dingo

Dingo/
    Project.toml:
        uuid = "7a7925be-828c-4418-bbeb-bac8dfc843bc"

    src/Dingo.jl:
        # no imports

Here is a corresponding roots structure, represented as a dictionary:

roots = Dict(
    :Aardvark => UUID("00000000-0000-0000-0000-000000000000"), # no project file, nil UUID
    :Bobcat   => UUID("…"),                                    # project file without uuid, dummy UUID based on path
    :Cobra    => UUID("4725e24d-f727-424b-bca0-c4307a3456fa"), # UUID from project file
    :Dingo    => UUID("7a7925be-828c-4418-bbeb-bac8dfc843bc"), # UUID from project file
)

Here is the corresponding graph structure, represented as a dictionary:

graph = Dict(
    # Bobcat (dummy UUID based on path):
    UUID("…") => Dict(
        :Cobra => UUID("4725e24d-f727-424b-bca0-c4307a3456fa"),
        :Dingo => UUID("7a7925be-828c-4418-bbeb-bac8dfc843bc"),
    ),
    # Cobra:
    UUID("4725e24d-f727-424b-bca0-c4307a3456fa") => Dict(
        :Dingo => UUID("7a7925be-828c-4418-bbeb-bac8dfc843bc"),
    ),
    # Dingo:
    UUID("7a7925be-828c-4418-bbeb-bac8dfc843bc") => Dict(),
)

A few general rules to note:

1. A package without a project file can depend on any top-level dependency, and since every package in a package directory is available at the top-level, it can import all packages in the environment.
2. A package with a project file cannot depend on one without a project file since packages with project files can only load packages in graph and packages without project files do not appear in graph.
3. A package with a project file but no explicit UUID can only be depended on by packages without project files since dummy UUIDs assigned to these packages are strictly internal.

Observe the following specific instances of these rules in our example:

• Aardvark can import any of Bobcat, Cobra or Dingo; it does import Bobcat and Cobra.
• Bobcat can and does import both Cobra and Dingo, which both have project files with UUIDs and are declared as dependencies in Bobcat's [deps] section.
• Bobcat cannot depend on Aardvark since Aardvark does not have a project file.
• Cobra can and does import Dingo, which has a project file and UUID, and is declared as a dependency in Cobra's [deps] section.
• Cobra cannot depend on Aardvark or Bobcat since neither has a real UUID.
• Dingo cannot import anything because it has a project file without a [deps] section.
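
These instances can be checked mechanically with a small sketch. The helper can_import is hypothetical, and the UUID used for Bobcat is a made-up stand-in for its path-based dummy UUID.

```julia
using UUIDs

nil    = UUID(UInt128(0))
bobcat = UUID(UInt128(1))   # made-up stand-in for Bobcat's dummy UUID
cobra  = UUID("4725e24d-f727-424b-bca0-c4307a3456fa")
dingo  = UUID("7a7925be-828c-4418-bbeb-bac8dfc843bc")

roots = Dict(:Aardvark => nil, :Bobcat => bobcat, :Cobra => cobra, :Dingo => dingo)
graph = Dict(
    bobcat => Dict(:Cobra => cobra, :Dingo => dingo),
    cobra  => Dict(:Dingo => dingo),
    dingo  => Dict{Symbol,UUID}(),
)

# Hypothetical helper: can the package `importer` run `import name`?
# A package without a graph entry resolves its imports at top level (via roots).
function can_import(roots, graph, importer::Symbol, name::Symbol)
    deps = get(graph, roots[importer], roots)
    return haskey(deps, name)
end

println(can_import(roots, graph, :Aardvark, :Bobcat))  # prints true
println(can_import(roots, graph, :Bobcat, :Aardvark))  # prints false
println(can_import(roots, graph, :Dingo, :Cobra))      # prints false
```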

The paths map in a package directory is simple: it maps subdirectory names to their corresponding entry-point paths. In other words, if the path to our example package directory is /home/me/AnimalPackages then the paths map could be represented by this dictionary:

paths = Dict(
    (UUID("00000000-0000-0000-0000-000000000000"), :Aardvark) =>
        "/home/me/AnimalPackages/Aardvark/src/Aardvark.jl",
    # Bobcat (dummy UUID based on path):
    (UUID("…"), :Bobcat) =>
        "/home/me/AnimalPackages/Bobcat/src/Bobcat.jl",
    (UUID("4725e24d-f727-424b-bca0-c4307a3456fa"), :Cobra) =>
        "/home/me/AnimalPackages/Cobra/src/Cobra.jl",
    (UUID("7a7925be-828c-4418-bbeb-bac8dfc843bc"), :Dingo) =>
        "/home/me/AnimalPackages/Dingo/src/Dingo.jl",
)

Since all packages in a package directory environment are, by definition, subdirectories with the expected entry-point files, their paths map entries always have this form.

Environment stacks

The third and final kind of environment is one that combines other environments by overlaying several of them, making the packages in each available in a single composite environment. These composite environments are called environment stacks. The Julia LOAD_PATH global defines an environment stack—the environment in which the Julia process operates. If you want your Julia process to have access only to the packages in one project or package directory, make it the only entry in LOAD_PATH. It is often quite useful, however, to have access to some of your favorite tools—standard libraries, profilers, debuggers, personal utilities, etc.—even if they are not dependencies of the project you're working on. By adding an environment containing these tools to the load path, you immediately have access to them in top-level code without needing to add them to your project.
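
As a small illustration of adding a tools environment, the following pushes a directory onto LOAD_PATH and then removes it again. The temporary directory is only a stand-in for an environment that would actually contain tools.

```julia
# A throwaway directory standing in for an environment that holds your tools.
tools_env = mktempdir()

push!(LOAD_PATH, tools_env)   # appended last, so earlier environments take priority
appended_last = last(LOAD_PATH) == tools_env
println(appended_last)        # prints true

pop!(LOAD_PATH)               # undo the change
```

Appending rather than prepending keeps the project environment first in the stack, so the tools cannot shadow the project's own dependencies.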

The mechanism for combining the roots, graph and paths data structures of the components of an environment stack is simple: they are merged as dictionaries, favoring earlier entries over later ones in the case of key collisions. In other words, if we have stack = [env₁, env₂, …] then we have:

roots = reduce(merge, reverse([roots₁, roots₂, …]))
graph = reduce(merge, reverse([graph₁, graph₂, …]))
paths = reduce(merge, reverse([paths₁, paths₂, …]))

The subscripted rootsᵢ, graphᵢ and pathsᵢ variables correspond to the subscripted environments, envᵢ, contained in stack. The reverse is present because merge favors the last argument rather than first when there are collisions between keys in its argument dictionaries. There are a couple of noteworthy features of this design:

1. The primary environment—i.e. the first environment in a stack—is faithfully embedded in a stacked environment. The full dependency graph of the first environment in a stack is guaranteed to be included intact in the stacked environment including the same versions of all dependencies.
2. Packages in non-primary environments can end up using incompatible versions of their dependencies even if their own environments are entirely compatible. This can happen when one of their dependencies is shadowed by a version in an earlier environment in the stack (either by graph or path, or both).

Since the primary environment is typically the environment of a project you're working on, while environments later in the stack contain additional tools, this is the right trade-off: it's better to break your development tools but keep the project working. When such incompatibilities occur, you'll typically want to upgrade your dev tools to versions that are compatible with the main project.
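
The shadowing effect can be reproduced with two toy paths maps. The UUIDs and file paths below are made up for illustration; only the merge expression comes from the description above.

```julia
using UUIDs

# Made-up UUIDs for illustration only.
dep, tool = UUID(UInt128(1)), UUID(UInt128(2))

# paths1 belongs to the project; paths2 to a tools environment that also has Dep.
paths1 = Dict((dep, :Dep) => "/hypothetical/project/Dep-1.0/src/Dep.jl")
paths2 = Dict(
    (dep, :Dep)   => "/hypothetical/tools/Dep-2.0/src/Dep.jl",
    (tool, :Tool) => "/hypothetical/tools/Tool/src/Tool.jl",
)

# Earlier environments win: `merge` favors its last argument, hence the `reverse`.
paths = reduce(merge, reverse([paths1, paths2]))

println(paths[(dep, :Dep)])            # prints /hypothetical/project/Dep-1.0/src/Dep.jl
println(haskey(paths, (tool, :Tool)))  # prints true
```

Tool remains loadable, but it now runs against the project's copy of Dep rather than the copy its own environment recorded, which is exactly the trade-off described above.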

Conclusion

Federated package management and precise software reproducibility are difficult but worthy goals in a package system. In combination, these goals lead to a more complex package loading mechanism than most dynamic languages have, but they also yield the scalability and reproducibility that are more commonly associated with static languages. Typically, Julia users should be able to use the built-in package manager to manage their projects without needing a precise understanding of these interactions. A call to Pkg.add("X") will add to the appropriate project and manifest files, selected via Pkg.activate("Y"), so that a future call to import X will load X without further thought.