Streaming smart contract events and topics (ABI processing in python instead of BQ) #50
Another optimisation idea is to separate parsing into 2 steps:
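The two steps aren't spelled out here, so the following is only a guess at the intent, not necessarily what the commenter had in mind: a cheap first pass that filters logs by topic0, and a second pass that pays for full ABI decoding only on the matches. All names are illustrative.

```python
# Hypothetical two-step parse: (1) filter by topic0, (2) decode the matches.
# topic0 is keccak256 of the canonical event signature, so step 1 is a set
# lookup per log and touches no ABI machinery at all.
from eth_utils import keccak

TRANSFER_TOPIC0 = '0x' + keccak(text='Transfer(address,address,uint256)').hex()

def filter_step(logs, wanted_topic0s):
    return [l for l in logs if l['topics'] and l['topics'][0] in wanted_topic0s]

def decode_step(matched_logs, decoder):
    # Full ABI decoding (the expensive part) runs only on the filtered set;
    # `decoder` could be a decode_log like the one sketched further down.
    return [decoder(l) for l in matched_logs]
```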
We are going to encounter this on other chains that support smart contracts in general, and on multiple networks that support EVM-compatible contracts specifically. Here are a few examples:

I propose that we abstract this pattern above Ethereum.
Related issues, with emphasis on stream processing.
Another related issue for indexing: #28. There the proposal is to cluster/partition the results in BQ to reduce IO.

I suggest we take this a step further and generally support all contracts for which we have parsing capability (i.e. those for which we have an ABI). Specifically, at transaction processing time in the stream:
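As a minimal sketch of what this could look like: build a topic0 → ABI registry once from every ABI on file, then decode each transaction's logs against it as they arrive. This assumes eth-abi/eth-utils; `emit`, `build_registry`, and the table names are illustrative, not existing project APIs.

```python
from eth_abi import decode_abi          # eth-abi 2.x API
from eth_utils import event_abi_to_log_topic

def build_registry(event_abis):
    # topic0 (hex) -> event ABI, for every event we have an ABI on file for.
    return {'0x' + event_abi_to_log_topic(abi).hex(): abi for abi in event_abis}

def decode_log(abi, log):
    # Indexed params live in topics[1:], one 32-byte word each; dynamic
    # indexed types are really stored as hashes -- ignored here for brevity.
    indexed = [i for i in abi['inputs'] if i['indexed']]
    unindexed = [i for i in abi['inputs'] if not i['indexed']]
    row = {}
    for inp, topic in zip(indexed, log['topics'][1:]):
        (row[inp['name']],) = decode_abi([inp['type']], bytes.fromhex(topic[2:]))
    # Un-indexed params are ABI-encoded together in the data field.
    values = decode_abi([i['type'] for i in unindexed], bytes.fromhex(log['data'][2:]))
    row.update(zip((i['name'] for i in unindexed), values))
    return row

def process_transaction(receipt, registry, emit):
    # Called once per transaction in the stream. `emit` stands in for a
    # sink (Pub/Sub topic, BQ table, ...); it is hypothetical.
    for log in receipt['logs']:
        abi = registry.get(log['topics'][0]) if log['topics'] else None
        if abi is None:
            emit('logs_raw', log)            # no ABI yet: keep raw for backfill
        else:
            emit('events_' + abi['name'], decode_log(abi, log))
```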
The overall effect of this design:
Regarding this:

> we need to think about how we process historical data for newly added ABIs, so that this data appears in the …
@sinahab @epheph @jieyilong FYI we are designing a generic solution for ABI event generation in this issue and also blockchain-etl/ethereum-etl#216
Awesome, will take a look. Thanks!
Disclosure: I'm very new to this project, so pardon me if I missed or misunderstood things :) Here are my thoughts, in no particular order, on the proposal by @allenday above. I focused most heavily on the use case I'm trying to solve for Origin Protocol, since that's the one I'm most familiar with.
Right now we have ~500 events that we parse, and every day ~500 MB of new log data is generated. Each event's parsing query scans the full day of logs, so that's 500 × 500 MB ≈ 250 GB scanned in BigQuery daily, ~7.5 TB per month, which comes to ~$37 per month.
With 1,500 events we'd spend ~$100 per month.
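For concreteness, the arithmetic behind these figures, assuming BigQuery's on-demand rate of ~$5 per TB scanned and that every event's query scans the full day of logs:

```python
# Back-of-the-envelope BigQuery scan cost for per-event parsing queries.
DAILY_LOGS_GB = 0.5                  # ~500 MB of new log data per day
PRICE_PER_TB = 5.0                   # USD, on-demand scan pricing (assumed)

def monthly_cost(num_events, days=30):
    scanned_tb = num_events * DAILY_LOGS_GB * days / 1024
    return scanned_tb * PRICE_PER_TB

print(monthly_cost(500))     # ~36.6  -> the "~$37 per month" figure
print(monthly_cost(1500))    # ~109.9 -> roughly the "~$100" estimate
```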
An alternative to parsing logs in BigQuery is to export a JSON file to GCS, download it locally in Airflow, filter all events in the dataset at once, and then load the results back into BigQuery (loading is free). There is a PoC for how to parse logs in Python here: blockchain-etl/ethereum-etl@6710e6b.
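A rough sketch of that flow (illustrative only; the linked commit is the actual PoC), reusing a topic0 registry and the `decode_log` sketched earlier, with hypothetical file paths:

```python
import json

def parse_exported_logs(logs_path, registry, out_path):
    # One local pass over the day's exported logs decodes every registered
    # event at once, so cost no longer scales with the number of events.
    with open(logs_path) as src, open(out_path, 'w') as dst:
        for line in src:                      # newline-delimited JSON export
            log = json.loads(line)
            abi = registry.get(log['topics'][0]) if log['topics'] else None
            if abi is None:
                continue                      # not an event we track
            row = decode_log(abi, log)        # decode_log as sketched above
            row['event'] = abi['name']
            dst.write(json.dumps(row, default=str) + '\n')

# The output file is then loaded into BigQuery (e.g. `bq load` or the Airflow
# GCS-to-BigQuery operator); load jobs are free, so the recurring cost is
# just the worker's compute time.
```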