
Verify metrics using reporter approach #138

Open
kkrugler opened this issue Apr 25, 2018 · 2 comments
@kkrugler (Member):
From the Flink mailing list:

+1 to using reporters.

You will have to explicitly pass a configuration with the reporter settings to the environment via StreamExecutionEnvironment#createLocalEnvironment(int, Configuration).

The reporter can verify registrations/values and pass this information back to the main test thread through a static field (for simplicity).
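A minimal sketch of what the mailing-list suggestion could look like, assuming the Flink 1.5-era `MetricReporter` interface; the class and field names here are illustrative, not part of flink-crawler:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;

// Hypothetical test-only reporter: records registered metrics in a static map
// so the main test thread can inspect them, as suggested on the mailing list.
public class InMemoryReporter implements MetricReporter {

    // Static for simplicity; keyed by metric name.
    public static final Map<String, Metric> METRICS = new ConcurrentHashMap<>();

    @Override
    public void open(MetricConfig config) {}

    @Override
    public void close() {}

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        METRICS.put(metricName, metric);
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        // Keep the entry so the test can still read final values after teardown.
    }
}
```

The test would then wire the reporter into a local environment via the configuration, roughly:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

Configuration conf = new Configuration();
// "test" is an arbitrary reporter name chosen for this sketch.
conf.setString("metrics.reporter.test.class", InMemoryReporter.class.getName());
StreamExecutionEnvironment env =
        StreamExecutionEnvironment.createLocalEnvironment(2, conf);
// ... build and execute the topology, then assert on InMemoryReporter.METRICS ...
```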

@Schmed (Member) commented Sep 19, 2018:

I assume you haven't done anything on this yet, @vmagotra, so I'm going to take a crack at it.

I'm hoping this will lead the way toward a unit test for "Sync up the total active urls value with state in UrlDbFunction" (as @kkrugler suggested via a comment on "Try using stream harness support for unit testing").

@Schmed (Member) commented Sep 20, 2018:

For some reason, I'm getting the following failure when running locally (master + one unit test class I added):

schmed-mb-air-2:flink-1.5.2 schmed$ ./bin/flink run ~/Projects/flink-crawler/target/flink-crawler-tool-1.0-SNAPSHOT.jar -commoncrawl 2017-22 -cachedir ~/Downloads/flink-crawler/common-crawl-cache -seedurls ~/Downloads/flink-crawler/common-crawl-seed-urls.txt -forcecrawldelay 0 -maxcontentsize 100000 -outputfile ~/Downloads/flink-crawler/common-crawl-content.txt
Starting execution of program
Error running CrawlTool: Could not load the TypeInformation for the class 'org.apache.hadoop.io.Writable'. You may be missing the 'flink-hadoop-compatibility' dependency.
java.lang.RuntimeException: Could not load the TypeInformation for the class 'org.apache.hadoop.io.Writable'. You may be missing the 'flink-hadoop-compatibility' dependency.
at org.apache.flink.api.java.typeutils.TypeExtractor.createHadoopWritableTypeInfo(TypeExtractor.java:2140)
at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1759)
at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1701)
at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:956)
at org.apache.flink.api.java.typeutils.TypeExtractor.createSubTypesInfo(TypeExtractor.java:1176)
at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:889)
at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoFromInputs(TypeExtractor.java:969)
at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:831)
at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:625)
at org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:191)
at org.apache.flink.streaming.api.datastream.DataStream.flatMap(DataStream.java:594)
at com.scaleunlimited.flinkcrawler.topology.CrawlTopologyBuilder.build(CrawlTopologyBuilder.java:412)
at com.scaleunlimited.flinkcrawler.tools.CrawlTool.run(CrawlTool.java:121)
at com.scaleunlimited.flinkcrawler.tools.CrawlTool.main(CrawlTool.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025)
at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
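Given the error message, one likely fix is adding the dependency it names to the pom, assuming the standard Maven coordinates (the Scala suffix must match the Flink build in use):

```xml
<!-- Assumed coordinates for the artifact named in the error; 2.11 matches
     the default Scala version of Flink 1.5.x distributions. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-hadoop-compatibility_2.11</artifactId>
    <version>1.5.2</version>
</dependency>
```

Note that `TypeExtractor` loads this support reflectively, so when submitting with `bin/flink run` it may not be enough to shade the jar into the fat jar; the Flink docs suggest the flink-hadoop-compatibility jar may also need to be placed in the distribution's `lib/` folder.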
