Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Issues with larger schemas #1312

Closed
ayalon opened this issue Nov 9, 2022 · 9 comments · Fixed by #1379
Closed

Performance Issues with larger schemas #1312

ayalon opened this issue Nov 9, 2022 · 9 comments · Fixed by #1379
Assignees

Comments

@ayalon
Copy link

ayalon commented Nov 9, 2022

We built a larger schema with the new module and faced some severe performance issues.

I did several hours profiling and figured out multiple issues, that could easily be improved.

As a starting point, I profiled a fully cached GraphQL request with development mode turned off. This looked like this:

image

On every request, the schema is validated by the webonyx library even though it comes from the cache. This can be easily turned off
File SdlSchemaPluginBase::getSchema()

    $options = ['assumeValid' => true];
    $schema = BuildSchema::build($document, function ($config, TypeDefinitionNode $type) use ($resolver) {
      if ($type instanceof InterfaceTypeDefinitionNode || $type instanceof UnionTypeDefinitionNode) {
        $config['resolveType'] = $resolver;
      }

      return $config;
    }, $options);

There is an option assumeValid and if passed, the validation is skipped. This saved me approximatly 300ms per request.

Additionally, to support larger ASTs we also had to add the option 'noLocation' to the Parser::parse() function to avoid recursion loops.
This also gives a performance boost because the resulting AST is substantially smaller.

protected function getSchemaDocument(array $extensions = []) {

...

    $options = ['noLocation' => TRUE];
    $ast = Parser::parse(implode("\n\n", $schema), $options);
    if (empty($this->inDevelopment)) {
      $this->astCache->set($cid, $ast, CacheBackendInterface::CACHE_PERMANENT, ['graphql']);
    }

...

I went a head and figured out, that the performance is still an issue. I think there is a serious flaw in the implementation of the SchemaExtender. Here is a timeline when extending the schema and after fixing the issues above:

image

As we can see, a function is called, named \GraphQL\Type\Schema::getTypeMap(). The method decription looks like this:

    /**
     * Returns array of all types in this schema. Keys of this array represent type names, values are instances
     * of corresponding type definitions
     *
     * This operation requires full schema scan. Do not use in production environment.
     *
     * @return array<string, Type>
     *
     * @api
     */
    public function getTypeMap() : array

The maintainer warns us, to run this function in production. But we run it on every request if schema extensions are turned on. :-)

Therefore, I implemented a caching of the extended schema:

...
    if ($extendSchema = $this->getExtensionDocument($extensions)) {
      // Generate the AST from the extended schema and save it to the cache.
      // This is important, because the Drupal graphql module is not caching the extended schema.
      // During schema extension, a very expensive function \GraphQL\Type\Schema::getTypeMap() is called.
      $document = $this->getExtensionSchemaAST($schema, $extendSchema);
      $options = ['assumeValid' => TRUE];
      $extended_schema = BuildSchema::build($document, function ($config, TypeDefinitionNode $type) use ($resolver) {
        if ($type instanceof InterfaceTypeDefinitionNode || $type instanceof UnionTypeDefinitionNode) {
          $config['resolveType'] = $resolver;
        }
        return $config;
      }, $options);
      return $extended_schema;
    }

...

  public function getExtensionSchemaAST($schema, $extendSchema) {
    $cid = "schema_extension:{$this->getPluginId()}";
    if (empty($this->inDevelopment) && $cache = $this->astCache->get($cid)) {
      return $cache->data;
    }

    $schema = SchemaExtender::extend($schema, $extendSchema);
    $schema_string = SchemaPrinter::doPrint($schema);
    $options = ['noLocation' => TRUE];
    $ast = Parser::parse($schema_string, $options);
    if (empty($this->inDevelopment)) {
      $this->astCache->set($cid, $ast, CacheBackendInterface::CACHE_PERMANENT, ['graphql']);
    }

    return $ast;
  }

The first time the schema is extended, we save the resulting AST from the cache. On any subsequent request, we can get the AST from the cache and load it into our BuildSchema::build(). This is superfast, because BuildSchema::build is lazy-loading our types and is not scanning the whole schema.
See documentation here: https://webonyx.github.io/graphql-php/schema-definition-language/#performance-considerations

This changes will result in a massive performance improvement. In total, we saved more than 400ms on cached each request.

image

I will create a pull-request to address this issues.

@ayalon
Copy link
Author

ayalon commented Nov 9, 2022

Here is the related pull-request:
#1314

@pfrenssen
Copy link

Thanks for creating this issue! I am wondering though if it wouldn't be better to have separate issues and PRs for each of the identified performance problems? It can be more difficult to assess the impact if multiple issues are addressed at once.

@Kingdutch
Copy link
Contributor

Thanks for the extensive analysis! I agree with pfrenssen that this would be easier to review and assess with PRs for each individually discovered issue.

There has been some previous discussion around caching in #948 and I'd definitely be curious to hear how your changes play with anonymous functions, since we haven't gotten rid of those yet.

Additionally from what I see in your PR right now it appears you disable schema validation completely, rather than only for cached schema's?

@Sam152
Copy link

Sam152 commented May 26, 2023

I did some profiling and found some of the same problems @ayalon found. I applied his PR as a patch wholesale and it halved the time most queries took on my site.

@Kingdutch Kingdutch self-assigned this Jul 26, 2023
@Kingdutch
Copy link
Contributor

I hope to take a stab at this in the near future and provide a slightly different method of caching the schema with extensions.

Kingdutch added a commit that referenced this issue Jul 27, 2023
Issue #1312 provided some good pointers of performance considerations.
One of which was not parsing the schema on every request. We've
previously tried to cache the AST in the database but the representation
just isn't very well suited for it. Instead we should use the PhpStorage
provided by Drupal to just cache the entire AST as an array on disk
which can be loaded quickly.

See https://webonyx.github.io/graphql-php/schema-definition-language/#performance-considerations

This commit makes an example proof of concept of what this could look
like but it has a few problems:

- For some reason the extensions (e.g. `extend type Query`) don't get
  parsed properly which makes our validator think the field has a
  resolver but no schema. This might be a bug in the validator but
  that's unclear for now.
- We make breaking changes to methods in SdlShcemaPluginBase which could
  be undesirable and might need some thought and care to make this
  forward compatible.
- There's no good way to get an AST from a Schema so we perform some
  hacks. However it looks like the underlying library actually loses
  some of our Source information. The whole point of processing each
  .graphql file indepdently was so we could get better error messages
  of where syntax errors exist. This might be fixed in newer versions
  of the graphql-php library though
- We must perform an extension sorting step, at least with the Open
  Social schema, to make sure that module dependencies are properly
  handled. For example a .base.graphql file might not `extend` something
  but might still reference a type defined in another module's
  .base.graphql file. If the files are processed out of order then this
  causes an error.
  Even if this is fixed with the sorting by module dependencies then
  this might still be a breaking change for people who do not currently
  build their schema with these constraints in mind (i.e. properly
  architecting their schema's dependencies).
@klausi
Copy link
Contributor

klausi commented Nov 12, 2023

The noLocation hack was committed in #1364.

@klausi
Copy link
Contributor

klausi commented Nov 12, 2023

I think the simple approach in #1314 makes a lot of sense, thanks! I reworked it a bit in #1379, please test!

@Kingdutch I think this is a good intermediary step before we explore your more sophisticated performance improvements with the AST on disk.

@klausi
Copy link
Contributor

klausi commented Nov 12, 2023

Initial performance testing in Jobiqo with a lot of schema extensions shows 30-40% full request time improvements, yay! Will do a full test for regressions next.

@ayalon
Copy link
Author

ayalon commented Nov 12, 2023

Thanks for looking into this issue. All the improvement I have posted are already part of the graphql_core_schema modules since November 2022. This module is intensively used and tested.
https://www.drupal.org/project/graphql_core_schema

If all the changes, especially the caching of the schema extensions can go back into the main graphql module, we are happy to remove the changes from the graphql_core_schema. Thanks for working on this @klausi .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants