SPARQL construct NiFi processor is broken when producing one-to-many messages using the graph splitting technique #664

Closed
rorlic opened this issue Jul 11, 2024 · 1 comment · Fixed by #724
Labels: bug (Something isn't working), component: nifi (Issues related to LDI NiFi Processors), needs triage (Issue needs to be evaluated by team)

Comments

rorlic commented Jul 11, 2024

When a SPARQL Construct processor is configured to split a linked data model using graphs, it throws errors that prevent the pipeline from continuing.
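For readers unfamiliar with the technique, the sketch below shows the general idea of graph splitting as this report understands it: the query result is a set of quads, and all triples that share a graph name end up in one outgoing message. It is a plain-Java illustration only. `GraphSplitSketch`, `Quad` and `splitByGraph` are hypothetical names, the `urn:` identifiers are made up, and none of this is the processor's actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain-Java stand-in for RDF quads, not the processor's code.
class GraphSplitSketch {

    /** Hypothetical quad: a triple plus the name of the graph it belongs to. */
    static final class Quad {
        final String graph;
        final String subject;
        final String predicate;
        final String object;

        Quad(String graph, String subject, String predicate, String object) {
            this.graph = graph;
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        /** Renders the triple part of the quad, dropping the graph name. */
        String toTriple() {
            return subject + " " + predicate + " " + object + " .";
        }
    }

    /** Groups quads by graph name; each group would become one outgoing message. */
    static Map<String, List<String>> splitByGraph(List<Quad> quads) {
        Map<String, List<String>> messages = new LinkedHashMap<>();
        for (Quad quad : quads) {
            messages.computeIfAbsent(quad.graph, g -> new ArrayList<>()).add(quad.toTriple());
        }
        return messages;
    }

    public static void main(String[] args) {
        // Made-up data: three quads spread over two graphs.
        List<Quad> result = List.of(
                new Quad("urn:graph:a", "<urn:s1>", "<urn:p>", "\"1\""),
                new Quad("urn:graph:b", "<urn:s2>", "<urn:p>", "\"2\""),
                new Quad("urn:graph:a", "<urn:s1>", "<urn:q>", "\"3\""));
        splitByGraph(result).forEach((graph, triples) ->
                System.out.println(graph + " -> " + triples.size() + " triple(s)"));
    }
}
```

With one input producing several groups like this, the processor has to emit one flow file per group, which is the one-to-many case that fails here.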

To Reproduce
(see next comment for test setup)

  1. Download the RML adapter processor and the SPARQL Construct processor into the local NiFi extensions folder

  2. Start NiFi workbench:

    clear
    docker compose up -d --wait

  3. Log on to the NiFi workbench at https://localhost:8443/nifi using the credentials found in the .env file

  4. Import the pipeline (create a process group and browse for this pipeline)

  5. Start the pipeline

  6. Process the data:

    curl -X POST -H "Content-Type: text/csv" http://localhost:8080/pipeline --data-binary @./data.csv

  7. Verify that the SPARQL Construct processor issues errors:

    2024-07-11 09:21:36,928 WARN [Timer-Driven Process Thread-2] o.a.n.controller.tasks.ConnectableTask Processing halted: uncaught exception in Component [SparqlConstructProcessor[id=1e1bc70f-4f13-36fd-61c4-5dffa7f88225]]
    org.apache.nifi.processor.exception.FlowFileHandlingException: StandardFlowFileRecord[uuid=4d37a9ae-4269-434c-ba28-b76f9207db5f,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1720689654679-1, container=default, section=1], offset=1627, length=5892],offset=0,name=4d37a9ae-4269-434c-ba28-b76f9207db5f,size=5892] is not known in this session (StandardProcessSession[id=43])
            at org.apache.nifi.controller.repository.StandardProcessSession.validateRecordState(StandardProcessSession.java:3714)
            at org.apache.nifi.controller.repository.StandardProcessSession.validateRecordState(StandardProcessSession.java:3700)
            at org.apache.nifi.controller.repository.StandardProcessSession.transfer(StandardProcessSession.java:2351)
            at be.vlaanderen.informatievlaanderen.ldes.ldi.processors.services.FlowManager.sendRDFToRelation(FlowManager.java:84)
            at be.vlaanderen.informatievlaanderen.ldes.ldi.processors.services.FlowManager.sendRDFToRelation(FlowManager.java:73)
            at be.vlaanderen.informatievlaanderen.ldes.ldi.processors.SparqlConstructProcessor.onTrigger(SparqlConstructProcessor.java:61)
            at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
            at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274)
            at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244)
            at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
            at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
            at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
            at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
            at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
            at java.base/java.lang.Thread.run(Thread.java:1583)
    

Expected behavior
Multiple flow files should be created and the pipeline should continue.
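
According to the stack trace, the exception comes from the session's `validateRecordState` check during `transfer`: the flow file handed over is not one the session is tracking. The root cause inside `FlowManager` has not been confirmed in this report. The toy model below only illustrates that ownership rule and the expected one-to-many behavior. `ToySession`, `create`, `transfer` and `fanOut` are hypothetical names and deliberately simplified; this is not the NiFi `ProcessSession` API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: ToySession mimics an ownership check, it is NOT NiFi's ProcessSession.
class SessionOwnershipSketch {

    static final class ToySession {
        private final Set<String> known = new HashSet<>();
        private final List<String> transferred = new ArrayList<>();

        /** Registers a new flow file id with this session and returns it. */
        String create(String id) {
            known.add(id);
            return id;
        }

        /** Refuses ids this session never registered, echoing the message in the trace. */
        void transfer(String id) {
            if (!known.contains(id)) {
                throw new IllegalStateException(id + " is not known in this session");
            }
            transferred.add(id);
        }

        List<String> transferred() {
            return transferred;
        }
    }

    /** One-to-many: each outgoing message is created in the session that transfers it. */
    static List<String> fanOut(ToySession session, List<String> messages) {
        for (String message : messages) {
            session.transfer(session.create(message));
        }
        return session.transferred();
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        System.out.println(fanOut(session, List.of("graph-a", "graph-b")));
        try {
            // An id the session never created is rejected.
            session.transfer("created-elsewhere");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this simplified picture, the expected behavior corresponds to `fanOut` succeeding for every split message, while the reported failure corresponds to the rejected `transfer` call.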

rorlic added the needs triage label on Jul 11, 2024
github-project-automation bot moved this to 📋 Backlog in VSDS Backlog on Jul 11, 2024
rorlic commented Jul 11, 2024

Test setup attached: ldio.gh#664.zip

rorlic added the bug and component: nifi labels on Jul 11, 2024
jobulcke self-assigned this on Nov 26, 2024
jobulcke linked a pull request on Nov 28, 2024 that will close this issue
github-project-automation bot moved this from 📋 Backlog to 👀 In review in VSDS Backlog on Nov 28, 2024
Projects
Status: 👀 In review