Update error, duplicate key issue #101

Open
audiodude opened this issue Sep 4, 2019 · 5 comments

@audiodude
Member

When updating project 'Football', we got the following stack trace:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/rq/worker.py", line 822, in perform_job
    rv = job.perform()
  File "/usr/local/lib/python3.7/site-packages/rq/job.py", line 605, in perform
    self._result = self._execute()
  File "/usr/local/lib/python3.7/site-packages/rq/job.py", line 611, in _execute
    return self.func(*self.args, **self.kwargs)
  File "./wp1/logic/project.py", line 68, in update_project_by_name
    update_project(wikidb, wp10db, project)
  File "./wp1/logic/project.py", line 468, in update_project
    extra_assessments['extra'])
  File "./wp1/logic/project.py", line 247, in update_project_assessments
    process_unseen_articles(wikidb, wp10db, project, old_ratings, seen)
  File "./wp1/logic/project.py", line 357, in process_unseen_articles
    move_data['timestamp_dt'])
  File "./wp1/logic/page.py", line 53, in update_page_moved
    logic_move.insert(wp10db, new_move)
  File "./wp1/logic/move.py", line 34, in insert
    ''', attr.asdict(move))
  File "/usr/local/lib/python3.7/site-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.7/site-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 1075, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/usr/local/lib/python3.7/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.IntegrityError: (1062, "Duplicate entry '0-Andrei_Ra\\xC5\\xA3iu-2019-08-31T08:58:55Z' for key 'PRIMARY'")

This looks like a move insert that had already been processed. We're supposed to be doing insert_or_update, but there must be a flaw in the logic.
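For illustration, one way to make the insert idempotent at the SQL level is MySQL's `INSERT ... ON DUPLICATE KEY UPDATE`. This is only a sketch, not the code in `wp1/logic/move.py`: the table name `moves`, the column names, and the function `insert_or_update_move` are assumptions, since the issue does not show the real schema. The primary key order (namespace, article, timestamp) is inferred from the duplicate entry in the error message.

```python
# Hypothetical table and column names; the real schema is not shown in this issue.
UPSERT_MOVE_SQL = '''
  INSERT INTO moves
    (m_timestamp, m_old_namespace, m_old_article, m_new_namespace, m_new_article)
  VALUES
    (%(m_timestamp)s, %(m_old_namespace)s, %(m_old_article)s,
     %(m_new_namespace)s, %(m_new_article)s)
  ON DUPLICATE KEY UPDATE
    m_new_namespace = VALUES(m_new_namespace),
    m_new_article = VALUES(m_new_article)
'''


def insert_or_update_move(cursor, move):
    """Insert a move row, or update the target columns if the key already exists.

    `cursor` is a DB-API cursor (for example from PyMySQL) and `move` is a
    dict whose keys match the named placeholders in UPSERT_MOVE_SQL.
    """
    cursor.execute(UPSERT_MOVE_SQL, move)
```

A statement like this would avoid the IntegrityError, but it would also hide whatever makes the existing "already processed" check miss the row, so it may be better treated as a safety net than as the fix.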

@audiodude added the bug label Sep 4, 2019
@kelson42 pinned this issue Sep 10, 2019
@kelson42
Collaborator

@audiodude Still a bug?

@audiodude
Member Author

@kelson42 I haven't seen this one happen recently. We could close it as not reproducible, and wait for it to happen again to re-open.

@audiodude
Member Author

And of course, as I look in the logs I find the following (Murphy's Law):

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/rq/worker.py", line 822, in perform_job
    rv = job.perform()
  File "/usr/local/lib/python3.7/site-packages/rq/job.py", line 605, in perform
    self._result = self._execute()
  File "/usr/local/lib/python3.7/site-packages/rq/job.py", line 611, in _execute
    return self.func(*self.args, **self.kwargs)
  File "./wp1/logic/project.py", line 75, in update_project_by_name
    update_project(wikidb, wp10db, project)
  File "./wp1/logic/project.py", line 505, in update_project
    update_project_assessments(wikidb, wp10db, project, extra_assessments)
  File "./wp1/logic/project.py", line 269, in update_project_assessments
    process_unseen_articles(wikidb, wp10db, project, old_ratings, seen)
  File "./wp1/logic/project.py", line 395, in process_unseen_articles
    move_data['timestamp_dt'])
  File "./wp1/logic/page.py", line 53, in update_page_moved
    logic_move.insert(wp10db, new_move)
  File "./wp1/logic/move.py", line 34, in insert
    ''', attr.asdict(move))
  File "/usr/local/lib/python3.7/site-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.7/site-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 1075, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.7/site-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/usr/local/lib/python3.7/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.IntegrityError: (1062, "Duplicate entry '0-Vishal\\xE2\\x80\\x93Shekhar-2019-10-18T16:46:32Z' for key 'PRIMARY'")

@audiodude
Member Author

It looks like it has something to do with non-ASCII characters: both failing keys contain multi-byte sequences (\xC5\xA3 and \xE2\x80\x93).
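To see which characters are involved, the escaped bytes from the two error messages can be decoded. A minimal sketch, assuming the byte sequences are UTF-8 (which they appear to be); the helper `non_ascii_chars` and the variable names are illustrative only:

```python
import unicodedata

# Byte sequences copied from the two "Duplicate entry" messages above.
titles = [
    b'Andrei_Ra\xc5\xa3iu',
    b'Vishal\xe2\x80\x93Shekhar',
]


def non_ascii_chars(raw):
    """Decode raw as UTF-8 and return (char, Unicode name) for each non-ASCII char."""
    text = raw.decode('utf-8')
    return [(ch, unicodedata.name(ch)) for ch in text if ord(ch) > 127]


for raw in titles:
    print(raw.decode('utf-8'), non_ascii_chars(raw))

# The same bytes read as latin-1 yield a different, longer string. A wrong
# connection encoding could introduce this kind of mismatch between the value
# that is looked up and the value that is stored.
mojibake = titles[0].decode('latin-1')
```

The first title contains a "t with cedilla" and the second an en dash, so neither key is pure ASCII.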

@kelson42
Collaborator

I believe this bug is going to be a blocker if we want to consider using the WP1 engine with a few other Wikipedias. The tests should probably be extended to ensure the WP1 engine deals properly with accented characters. With MySQL, a few things usually need to be checked:

  • tables charset/collation
  • DB default charset/collation
  • Connection encoding
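The checklist above could be scripted roughly as follows. This is a sketch, not part of WP1: the names `CHARSET_CHECKS` and `report_charsets` are made up for illustration, and the function assumes a DB-API cursor that is already connected to the database under test.

```python
# Read-only queries covering the three items in the checklist above.
CHARSET_CHECKS = {
    'connection and server variables': "SHOW VARIABLES LIKE 'character_set_%'",
    'collation variables': "SHOW VARIABLES LIKE 'collation_%'",
    'database default': (
        "SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
        "FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = DATABASE()"),
    'tables': (
        "SELECT TABLE_NAME, TABLE_COLLATION "
        "FROM information_schema.TABLES WHERE TABLE_SCHEMA = DATABASE()"),
}


def report_charsets(cursor):
    """Run each check on a DB-API cursor and return {label: rows}."""
    report = {}
    for label, sql in CHARSET_CHECKS.items():
        # No parameters are passed, so the literal % in the LIKE patterns is
        # sent to the server unchanged.
        cursor.execute(sql)
        report[label] = list(cursor.fetchall())
    return report
```

With PyMySQL, the connection encoding itself is chosen through the `charset` argument of `pymysql.connect` (for example `charset='utf8mb4'`), so comparing that setting with the values reported here would show whether the three layers agree.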
