I spent a week writing DokuWiki::ResolveName, a module that lets you resolve the pagename of link targets in DokuWiki.

Table of contents

Because there’s a lot here, feel free to skip any section you’re not interested in.

How to use the module

Say you have a page at minecraft:start with the following contents:

====== Minecraft ======

Learn more about [[carrot|Carrots]]. Or go back to the [[playground:]]

To know which pages the links go to, you can use the only function exported by the module, resolve-name($current-page, $link), like so:

use DokuWiki::ResolveName;

resolve-name 'minecraft:start', 'carrot';
# RESULT: ':minecraft:carrot'

You can also pass optional named arguments, such as a list of existing pages. This is useful when resolving namespace links (those that end in a colon), because they depend on which pages already exist in the wiki.

resolve-name 'minecraft:start', 'playground:',
    pages => ('start', 'playground:playground');
# RESULT: ':playground:playground'

In this case, we’re resolving playground:. Since the page playground:playground (a page with the same name as the namespace) exists, the link points to that. If we didn’t supply a list of pages, it would point to playground:start by default.

What I’ve learned

Resolving circular dependencies with roles

When developing the code, I had a circular dependency between the PageName class and the App class. The App class had methods to manage existing pages, so it required PageName; and the PageName class had an attribute that stored the current App object, which is used for resolving links. This can be summed up with the following graph.

[App <-> Config]

As you can see, there’s a loop. An arrow from App to Config means that App depends on Config. There may be another way of solving this without using roles, but I couldn’t think of one. The Raku FAQ is surprisingly helpful by outright telling you how to fix the problem:

Very likely you can accomplish what you are trying to do using roles. Instead of A.pm6 depending on B.pm6 and B.pm6 depending on A.pm6, you can have A-Role.pm6 and B-Role.pm6 and classes in A.pm6 and B.pm6 implementing these roles respectively. Then you can depend on A-Role.pm6 and B-Role.pm6 without the need for the circular dependency.

From https://docs.raku.org/language/faq#Can_I_have_circular_dependencies_between_modules?

Thus, I implemented that solution, and indeed, the dependency graph had no loops anymore.

[App -> ConfigLike; Config -> AppLike; ConfigLike --> Config; AppLike --> App]

A dashed arrow from App to AppLike means that App implements the AppLike role. As you can see, there are no more loops. In practice, this means that each class only requires the interface of the other class. At runtime, we still have to create each object and link them together to recreate the dependency loop between the instances.
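To make the second graph concrete, here is a minimal single-file sketch of the idea. The attributes and methods (pages, startpage, config, app) are simplified stand-ins chosen for illustration, not the module’s actual code.

```raku
# Hypothetical, simplified stand-ins for the real roles and classes.
role AppLike    { method pages     { ... } }
role ConfigLike { method startpage { ... } }

# App only knows the ConfigLike interface, not the Config class.
class App does AppLike {
    has ConfigLike $.config is rw;
    has @.pages;
}

# Config only knows the AppLike interface, not the App class.
class Config does ConfigLike {
    has AppLike $.app is rw;
    has $.startpage = 'start';
}

# Create both objects, then link them to recreate the loop at runtime.
my $app    = App.new(pages => <start playground:playground>);
my $config = Config.new(:$app);
$app.config = $config;

say $app.config.app === $app;   # True
say $config.app.pages.elems;    # 2
```

Neither class mentions the other by name, so in a multi-file layout each file would only need to load the two role files.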

Role composition and minimal interfaces

At some point in the development, I wanted functions to only take the minimum configuration object needed. Since DokuWiki’s configuration has at least 30 options, I felt that it would be wrong to pass that whole data structure around. In retrospect, it doesn’t really matter.

Forcing functional when there’s an omnipresent state doesn’t work

My first idea was to go fully functional, and have only the minimal configuration settings be passed as named arguments. This led to a lot of passing around named arguments.

sub colon-normalize ($string, :$sepchar) { ... }
sub split-pageid    ($string, :$useslash) { ... }

sub clean-pageid (@parts, :$sepchar) {
    @parts.map: { colon-normalize($_, :$sepchar) || Slip.new }
}
sub new-pageid ($string, :$useslash, :$sepchar) {
    my @parts = split-pageid($string, :$useslash);
    @parts = clean-pageid(@parts, :$sepchar)
}

And even then, it wasn’t clean. For example, useslash couldn’t be used immediately: it had to be preprocessed before you could use it as a function argument. This introduced defaults into a functional interface, which felt wrong.

When using this functional interface with a configuration object, there was a lot of redundant code just to pass the named arguments, such as startpage => $config.startpage. So instead of passing each configuration switch one by one, why not pass a configuration object?

sub colon-normalize ($string, Config::ColonNormalize $config) { ... }
sub split-pageid    ($string, Config::SplitPageId    $config) { ... }

I was still trying to require a minimal interface at the time, so each function would take a custom configuration interface. I would have a custom role for each function, and the main Config class would have to compose all of them.

How to properly require attributes: you don’t

As I was creating a few roles, and making Config compose them, I ran into a problem: attributes didn’t want to merge nicely. This is why, at first, I thought that I would need to have every config variable as its own role, because you can’t have the same attribute defined in two different roles:

role Config-A {
    has $.startpage;
    has $.useslash
}
role Config-B {
    has $.startpage;
    has $.autoplural
}

class Config does Config-A does Config-B { }
# ERROR: Attribute '$!startpage' conflicts in role composition

I tried using methods instead, but that didn’t work either, as even though they looked the same, they were considered different:

role Config-A {
    method startpage { 'start' }
    method useslash { False }
}
role Config-B {
    method startpage { 'start' }
    method autoplural { False }
}

class Config does Config-A does Config-B { }
# ERROR: Method 'startpage' must be resolved by class Config
#        because it exists in multiple roles (Config-B, Config-A)

Since I didn’t really care about setting the default value in the roles, as they would always need to be merged manually, I realized that I really only needed to require that the methods exist.

role Config-A {
    method startpage { ... }
    method useslash { ... }
}
role Config-B {
    method startpage { ... }
    method autoplural { ... }
}

class Config does Config-A does Config-B {
    has $.startpage;
    has $.useslash;
    has $.autoplural;
}
# No errors

Turns out, the automatically generated methods for accessing attributes are taken into account when checking for methods required by roles. Moreover, they don’t conflict if both are stubbed.

In retrospect, I’ve learned that it’s not really worth the trouble to go with that design, as you’d end up creating custom roles for each combination of configuration options used, and composing them all into the main Config class, leading to a lot of composition that doesn’t mean much.

However, it is useful to have some smaller interfaces used by multiple functions with a certain intent associated with them. For example, PageName::Config is the set of options that are used when dealing with pages and links. It puts things into context.

role PageName::Config {
    method startpage  { ... }
    method useslash   { ... }
    method autoplural { ... }
    method sepchar    { ... }
}
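As a rough sketch of how such an interface might be used, a function can ask for PageName::Config in its signature and accept any object that composes it. The Config class, its default values, the extra title option and the start-of helper below are all hypothetical.

```raku
role PageName::Config {
    method startpage  { ... }
    method useslash   { ... }
    method autoplural { ... }
    method sepchar    { ... }
}

# Hypothetical full configuration, with an extra unrelated option.
class Config does PageName::Config {
    has $.startpage  = 'start';
    has $.useslash   = False;
    has $.autoplural = False;
    has $.sepchar    = '_';
    has $.title      = 'My wiki';
}

# Hypothetical helper: it only promises to use the page-related options.
sub start-of(Str $namespace, PageName::Config $config) {
    $namespace ~ ':' ~ $config.startpage
}

say start-of('playground', Config.new);   # playground:start
```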

Haskell typeclasses and derived behaviour with roles

This made me think of Haskell typeclasses. I haven’t done any Haskell, but I remember reading that Raku’s roles could be used to implement them. (As of writing, the docs for typeclasses in Raku are still a work-in-progress: https://docs.raku.org/language/haskell-to-p6#Typeclasses)

You could require certain methods by defining them as stubbed methods, and then implement your typeclass behaviour based on those. For example, this is how I implemented delim and start for configuration objects. They aren’t part of the data, but derived from it. Maybe this is what Rust’s traits do too?

role ConfigLike {
    method startpage { ... }
    method start { self.startpage }
    
    method useslash { ... }
    method delim { (':', ';', ('/' if self.useslash)) }
}

class Config does ConfigLike {
    has $.startpage = 'start';
    has $.useslash = True;
}

say Config.new.delim; # OUTPUT: (: ; /)

Weird thing: cloning stubbed roles

Look at the following code. We’re defining an abstract role, and we try to clone the type object.

role A {
    method poke { ... }
}

A.clone;
# ERROR: Method 'poke' must be implemented by A
#        because it is required by roles: A.

The compiler replies that the cloned object should implement the stubbed method, although it is a type object. I’m quite confused by this, and I don’t really know if this is intended behaviour, or if it does need to implement it for it to be logically sound.

An operator to conditionally pass named arguments

So when I was passing named arguments through successive function calls, I would often pass an undefined value through 3 functions. The issue was that I had a default set in the last function’s signature, like so: :$option = 'default', but it wouldn’t get applied, since I was actually passing a type object (think null in other languages), and an argument that is passed explicitly, even an undefined one, overrides the default.

To prevent that, I would need to check before each function call, that the value was defined, and to pass it only if it was defined. But I didn’t want to write an if statement for each function call, duplicating the function call just because it was undefined.
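Here is a small reproduction of the problem and of the verbose fix. The function names inner, outer and careful-outer are made up for this sketch.

```raku
sub inner(:$option = 'default') { $option }

# Always forwards the argument, even when it was never set.
sub outer(:$option) { inner(:$option) }

say inner();                     # default
say outer().defined;             # False: the default was not applied
say outer(option => 'custom');   # custom

# The verbose fix: the call to inner is written twice.
sub careful-outer(:$option) {
    $option.defined ?? inner(:$option) !! inner()
}

say careful-outer();             # default
```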

There’s a feature in Raku which flattens lists into a list of arguments. In practice, you can add all your function parameters into a list, and then call the function like so: function(|@arguments). By prefixing the list with the vertical pipe character, the list would be flattened and the function would see multiple arguments, instead of a single argument which happens to be a list.
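A short sketch of argument flattening, using a made-up describe function. A list flattens into positional arguments and a hash flattens into named ones.

```raku
# Hypothetical function with two positionals and one named argument.
sub describe($page, $link, :$prefix = 'link') {
    "$prefix from $page to $link"
}

my @arguments = 'minecraft:start', 'carrot';
my %options   = prefix => 'namespace link';

say describe(|@arguments);              # link from minecraft:start to carrot
say describe(|@arguments, |%options);   # namespace link from minecraft:start to carrot
```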

Although the | prefix is also used to turn lists into Slips, it really is a different operator when used in argument lists.

Now, if you simply try to define a function that takes a Pair and returns it if its value is defined, and nothing otherwise, it won’t work. The body itself is fine, because an if statement evaluates to the value of the executed block if the condition is true, and to the Empty slip otherwise. The problem is in how the function gets called.

sub with-val (Pair:D $p) { $p if $p.value.defined }
with-val named => 'arg';
# ERROR: Too few positionals passed; expected 1 argument but got 0

We get an error because our pair is interpreted as a named argument to the function. We could rewrite the function to slurp all named arguments (with *% in the signature), but I preferred to make it into an operator.

Well, honestly, I didn’t know that the behaviour would change if I made it into an operator. What I was trying to do, however, was to have the precedence of the with-val operator be higher than a comma, so there wouldn’t be extra parentheses. This is really easy to do in Raku: just add is tighter(&infix<,>) to the function declaration.
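For comparison, the slurpy alternative could look roughly like this. It is a sketch of the approach I did not take, and the target function is made up for the example.

```raku
# Slurp every named argument and keep only those with a defined value.
sub with-val(*%named) {
    my %kept = %named.grep(*.value.defined);
    %kept
}

# Hypothetical function at the end of the call chain.
sub target(:$option = 'default') { $option }

my $option;                        # undefined for now
say target(|with-val(:$option));   # default

$option = 'custom';
say target(|with-val(:$option));   # custom
```

The returned hash still has to be flattened with | at the call site, so that its pairs are passed as named arguments.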

sub prefix:<with-val> (Pair:D $pair) is tighter(&infix<,>) {
    $pair if $pair.value.defined
}

Although the function returns correctly, I couldn’t find a way to make function with-val :$named-arg work as intended. with-val would return a Pair object, and I couldn’t find a way for it to have Slip behaviour, thus the earlier note: Slips and argument flattening are different things, and you can’t return an object that has the argument list flattening effect.

This is why the usage is actually function |with-val :$named-arg. I think that |with-val kinda looks like an actual operator, so it works out in the end.

Negating custom character classes

In Raku regexes, you can use <-[aeiou]> to negate the character class <[aeiou]>. This would match all characters that aren’t in aeiou. You can also negate predefined character classes like <upper> by adding a minus sign after the angle bracket, like so: <-upper>. This behaviour isn’t (well) documented, but it is tested.

Another way to think about negated character classes, is that <-alpha> is equivalent to <!before <alpha>> . (from https://design.raku.org/S05.html#line_1826). This will be useful later.
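A quick sketch to see these forms side by side. The sample string is arbitrary, and comb with a regex simply collects every match.

```raku
my $text = 'Raku2020';

say $text.comb(/ <-[aeiou]> /).join;             # Rk2020
say $text.comb(/ <-upper> /).join;               # aku2020

# The two equivalent ways of writing "not alpha":
say $text.comb(/ <-alpha> /).join;               # 2020
say $text.comb(/ <!before <alpha>> . /).join;    # 2020
```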

What about custom character classes then? You can’t define a named character class and use it outside of the regex. You can, however, make a named capture out of a character class with the <foo=[aeiou]> syntax. That is useful when writing Raku grammars.

Actually, it might be possible to define custom character classes, but it wasn’t documented, nor simple to find out how.

To call an external regex from yours, you have 3 options: call a lexically scoped regex with <&bar>, call a rule within a grammar with <Grammar::rule>, or interpolate a regex with $($regex).

Calling lexically scoped regexes is briefly mentioned in Regexes §Subrules, calling a rule from a grammar isn’t documented anywhere, and I only know of it from a Reddit comment. There’s also <::($somename)> dynamic lookup and maybe other ways, but I can’t think of them.

Since I wanted to define the character class, but use its negation, I ran into a problem. The first two methods didn’t work because <-&foo> and <-G::foo> aren’t valid syntax.

Simple interpolation (<-$foo>) didn’t work either. However, you could work around it by using the equivalent form mentioned earlier, <!before $foo> . (where the trailing dot matches any single character). I considered it, but decided that it was not clear enough.

I ended up settling on defining a grammar, in which I defined a charclass token and a not-charclass token, like so:
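For reference, the lookahead workaround could be sketched like this. The variable name $charclass and the sample string are made up, and the character class is the one used in the grammar version.

```raku
# The character class we want to negate, stored as a regex object.
my $charclass = / <[ . \- _ ] + [a..z A..Z] + [0..9]> /;

# "Not followed by something in the class, then any one character."
my $not-charclass = / <!before $charclass> . /;

say 'my page!'.subst($not-charclass, '_', :g);   # my_page_
```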

grammar G {
    token not-charclass { <-charclass> }
    token charclass { <[ . \- _ ] + [a..z A..Z] + [0..9]> }
}

This works because in a grammar, you can call other methods by name between angle brackets. Since <-charclass> has a single special character, rather than two as in <-&charclass> and <-$charclass>, the compiler happily calls the right method. Probably.

Now that we have the right regex, we can just call it in our code by using the rule-in-grammar syntax: / <G::not-charclass> /. Done!

Testing helps break code with confidence

This was the first code project where I got mileage out of testing my code. At one point I was breaking all the modules and rewriting how they interacted with each other. Since I had a test suite, I knew that if the tests passed, I had reimplemented the features correctly.

The really nice part of using a test suite is that it can cover edge cases. Since we usually don’t run into edge cases in normal use, bugs in them tend to be found much later. With a test suite, that doesn’t really happen.

Being able to test the code so easily was helped by Perl’s culture of having tests, and by the fact that the Test module is in the standard library, and really simple to use.
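To give an idea of how little ceremony the Test module needs, here is a minimal sketch. The namespace-of helper is a made-up stand-in, not a function from the module.

```raku
use Test;

# Hypothetical helper, standing in for real module code.
sub namespace-of(Str $pageid) {
    my @parts = $pageid.split(':');
    @parts.pop;                # drop the page name
    @parts.join(':')
}

plan 3;

is namespace-of('minecraft:start'), 'minecraft', 'simple namespace';
is namespace-of('a:b:page'),        'a:b',       'nested namespace';
is namespace-of('start'),           '',          'edge case: page in the root namespace';
```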

Finally, if you only commit code when your test suite more or less works, you can always just roll back to the latest working version, and restart from there.

Exporting packages in Raku isn’t too intuitive

Exporting functions is simple: you add is export to your function declaration and you’re good to go. You can also export classes the same way: class Config is export { ... }. But trouble arrives when you’re using namespaced classes in a module.
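The simple case can be tried in a single file, assuming a module declared in the same file and brought in with import rather than use. The Greeter module and its contents are made up for this sketch.

```raku
# Hypothetical module, declared inline so the example is self-contained.
module Greeter {
    sub greet($name) is export { "Hello, $name!" }
    class Config is export { has $.startpage = 'start' }
}

# For a module living in its own file, this would be `use Greeter;`.
import Greeter;

say greet('wiki');           # Hello, wiki!
say Config.new.startpage;    # start
```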

Exporting a namespaced class

Say you declare a class Some::Class inside the module My::Module. If you add the export trait to that class, it would export it as simply Class. In any case, you can still access it by the name of My::Module::Some::Class.

At first, I wanted to export it as Some::Class, because I had defined another class in the same file with the same short name, say Another::Class. And if I tried exporting them both, the compiler would tell me that Class has already been exported, and that I can’t export it twice.

A workaround I found was to make use of the EXPORT function. If you define an EXPORT function in your module file, you can control which symbols get exported. This is useful to export short names of classes for example.

sub EXPORT {
    %(
        'SomeClass' => Some::Class,
        'AnotherClass' => Another::Class
    )
}

With this example, I exported aliases to my classes. When including this module, you would use SomeClass to refer to My::Module::Some::Class. The two Class types don’t conflict anymore.

But what I wanted was to be able to import the module, and have Some::Class be an alias to My::Module::Some::Class. We can do this by aliasing the namespaces.

sub EXPORT {
    %(
        'Some' => Some,
        'Another' => Another,
    )
}

We have now aliased Some to My::Module::Some, which means that Some::Class now refers to My::Module::Some::Class. This is a bad idea because you can overwrite other namespaces.

A cleaner way would be to ditch the module entirely, and define the classes directly in your file, outside of any module. Since subroutines are still file scoped, you won’t pollute the caller namespace when importing the file. This is explained a bit in the Raku docs. Not sure if it’s clear though.

Exporting namespaces that already exist is wonky

Okay, so there’s an exception to what I said earlier. There is a case where you can “export” namespaced classes from a module. That is when the class namespace is a subpackage of the current namespace. So in the module Cat, defining a class Cat::Type would automatically export it. You would then be able to access it as Cat::Type and not Cat::Cat::Type.

That is quite a useful quirk, although a bit weird. But it doesn’t stop there. If the first namespace of your module is the same as the first namespace of your class, then that also gets exported. For example, if, in the module Open::Street::Map, you define a class Open::Whisper::Systems, then when you include the module, Open::Whisper::Systems would be an alias to Open::Street::Map::Open::Whisper::Systems.

That sounds quite problematic, because of namespace ownership. There’s a bug report “Symbols that start with core namespaces always get exported” from January 2018, that mentions similar weird behaviour with namespaces.

I didn’t understand: META6.json’s provides key

I was reading up on how to structure a module for it to be redistributed in the ecosystem. One important file is META6.json, which contains metadata about the package. There are several keys which are documented roughly, and one of them is the provides key.

In the provides section, include all the namespaces provided by your distribution and that you wish to be installed; only module files that are explicitly included here will be installed and available with use or require in other programs. This field is mandatory.

What I understand from that is that you can have your module in a file named Wiki/DokuWiki/Utils/ResolveName.rakumod in your source repository, but it will be renamed to DokuWiki/ResolveName.rakumod when you install it.

What I don’t understand are the implications of it. Is it just a way to allow library directories to not be named lib, while still being able to be found by the package manager? What are the implications of renaming the file: would I still be able to use the old file name within my own package, like a symbolic link?
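For illustration, a provides section maps each module name to the path of the file that implements it, relative to the root of the distribution. A hypothetical fragment for the layout described above might look like this (the other META6.json keys are left out):

```json
{
    "provides": {
        "DokuWiki::ResolveName": "Wiki/DokuWiki/Utils/ResolveName.rakumod"
    }
}
```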

Finally, I was thinking that it could be a way to signal to the package manager that you are claiming namespaces within the files (by defining namespaced classes), so that users won’t be surprised by them?

I didn’t understand: Pseudo-packages

When I was still trying to export namespaced classes from a module, I tried messing around with pseudo-packages, notably CALLER and OUTER. And, I have to confess, I have no idea what they do. An example in the docs would be very nice to have.

Wrapping up

This took longer to code than I would have wanted. I think it was worth it in the end. It was long enough that I could feel the weight of my design decisions and coding style.

This blog post is quite long so I don’t really expect you to read all of it, but if you did, thank you!