Skip to content

Migrate all 3C JSON generation to use the LLVM JSON library #620

Open
@mattmccutchen-cci

Description

@mattmccutchen-cci

We should migrate all 3C code that generates JSON to use the LLVM JSON library rather than manual string concatenation, which is prone to bugs like #605 and #619.

The main decision we need to make is which version of the LLVM JSON API is most suitable for our purpose. If I understand correctly, our options are (from lowest to highest level):

  1. llvm::json::OStream begin/end. Example from the API documentation:
    /// json::OStream J(OS);
    // J.arrayBegin();
    /// for (const Event &E : Events) {
    /// J.objectBegin();
    /// J.attribute("timestamp", int64_t(E.Time));
    /// J.attributeBegin("participants");
    /// for (const Participant &P : E.Participants)
    /// J.value(P.toString());
    /// J.attributeEnd();
    /// J.objectEnd();
    /// }
    /// J.arrayEnd();
  2. llvm::json::OStream lambda. Example from the API documentation:
    /// json::OStream J(OS);
    /// J.array([&]{
    /// for (const Event &E : Events)
    /// J.object([&] {
    /// J.attribute("timestamp", int64_t(E.Time));
    /// J.attributeArray("participants", [&] {
    /// for (const Participant &P : E.Participants)
    /// J.value(P.toString());
    /// });
    /// });
    /// });
  3. Constructing an llvm::json::Value that we can ultimately send to an output stream with <<. One example from clangd (there are many other examples in the same file demonstrating different things):
    llvm::json::Object Diag{
    {"range", D.range},
    {"severity", D.severity},
    {"message", D.message},
    };
    if (D.category)
    Diag["category"] = *D.category;
    if (D.codeActions)
    Diag["codeActions"] = D.codeActions;
    if (!D.code.empty())
    Diag["code"] = D.code;
    if (!D.source.empty())
    Diag["source"] = D.source;
    if (D.relatedInformation)
    Diag["relatedInformation"] = *D.relatedInformation;

The style of our existing code is most similar to (1), but I'm guessing that was only because we didn't want to do the work to introduce the abstractions needed for (2) or (3), not because we thought they would be bad. Migrating to (2) or (3) would mean more churn now but may make the code much easier to maintain in the long run.

Currently, a bunch of 3C objects have a T::dumpJson(llvm::raw_ostream &) method. Obviously, we'll have to change the interface of that method so that it can run in the middle of the dumping of a larger object via whichever JSON API version we choose. If we use version (3) and are willing to move to the JSON library's standard interface, which is a non-member function llvm::json::Value toJSON(const T &), then the llvm::json::Value constructor will call the toJSON function on sub-objects implicitly for us, rather than us having to explicitly call our own function (dumpJson or whatever we name it). The clangd example above is using toJSON; if it didn't, it would have to do something like Diag["codeActions"] = D.codeActions.dumpJson(), etc.

If we don't foresee JSON output becoming a performance bottleneck anytime soon, I think we're probably best served by migrating all the way to (3) with toJSON, but I'd welcome other thoughts.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions