Discord data parsing error #63
Comments
It should work fine, as seen here. Note that the regex you pasted in the issue is rendered incorrectly; here's the actual regex, just for context:
r'\[.+\] (?!' + re.escape(personName) + r').+\n(.+)\n{2}(?:\[.+\] ' + re.escape(personName) + r'\n(.+)\n{2})'
It should be noted that the dataset training works on response sets. This means that the regex captures your ( If the text history looked like-
you will now need to use
If you're still having trouble, it is highly likely that the regex is not the issue. Please provide a small (and preferably censored) version of the chatlog you're trying to parse, along with your input for
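For anyone following along, here is a minimal sketch of how the regex above behaves on a log in the format shown in the issue. The name `person_name`, the choice of `sai#2795` as the replying user, and the blank line after each message are assumptions made for illustration; the blank lines are what the `\n{2}` parts of the pattern require. The second pattern is the version as rendered in the issue body (without the backslashes), included only to show that it cannot match.

```python
import re

person_name = "sai#2795"  # assumption: the user whose replies we want to capture

# Sample log in the format from the issue. The blank line after each
# message is assumed here, because the pattern requires \n{2}.
data = (
    "[08-Oct-20 02:40 PM] ShadowRanger5#3348\n"
    "hello\n"
    "\n"
    "[08-Oct-20 03:00 PM] sai#2795\n"
    "Hi wassup\n"
    "\n"
)

# The regex quoted in the reply above, with the brackets escaped.
pattern = (
    r'\[.+\] (?!' + re.escape(person_name) + r').+\n(.+)\n{2}'
    r'(?:\[.+\] ' + re.escape(person_name) + r'\n(.+)\n{2})'
)

# The regex as rendered in the issue body: "[.+]" is now a character
# class matching a single "." or "+", not a bracketed timestamp.
rendered_pattern = (
    r'[.+] (?!' + re.escape(person_name) + r').+\n(.+)\n{2}'
    r'(?:[.+] ' + re.escape(person_name) + r'\n(.+)\n{2})'
)

response_sets = re.findall(pattern, data)
print(response_sets)                       # [('hello', 'Hi wassup')]
print(re.findall(rendered_pattern, data))  # []
```

Each match is a tuple of (other person's message, your reply), because `re.findall` returns the capture groups when a pattern has more than one.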
It is working, but it gives only 1 response pair even though I have approximately 70 DMs. Do I need to change the chat format in some way? Please tell me what I should do to retrieve the chats.
As long as they are in the format of a response - as in, another person's message followed by your message - it should be parsed correctly. Ensure you set
https://imgur.com/a/aeTAZUW Also, did you take into account that one person could have sent more than one message in a row?
Unfortunately, the training is only capable of working with atomic response sets - that is, one reply to one statement. But as long as there are multiple response sets in your chatlog, it should still work. Also, the screenshot does not show your inputs to the script, so I'm unsure where the dictionary length of 733 is coming from. Are you exporting messages from multiple sources? As a side note, please do not post images of debug text data; I cannot copy text from an image. You may try pasting your chatlog into a regex tester, such as the link I posted before, and checking the matches (make sure to alter the person name if you need to). I don't think it's a problem with the regex per se - perhaps there's something I'm missing. But this is the first time this issue has been encountered.
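If consecutive messages from the same person are the obstacle, one possible workaround is to merge them into a single message before running the regex. The sketch below is not part of this repository. The helper `merge_consecutive`, the `HEADER` pattern, and the sample log are hypothetical, and the code assumes that every message starts with a `[DD-Mon-YY HH:MM AM/PM] name` header line and is followed by one blank line.

```python
import re

# Assumed header format, based on the sample in the issue:
# "[08-Oct-20 02:40 PM] ShadowRanger5#3348"
HEADER = re.compile(r'^\[\d{2}-[A-Za-z]{3}-\d{2} \d{2}:\d{2} [AP]M\] (.+)$')


def merge_consecutive(log: str) -> str:
    """Join back-to-back messages from one author into a single message."""
    blocks = []  # each entry: [header line, author, message text]
    for chunk in log.strip().split("\n\n"):
        lines = chunk.split("\n")
        match = HEADER.match(lines[0])
        if not match:
            continue  # skip anything that does not start with a header
        author = match.group(1)
        text = " ".join(lines[1:])
        if blocks and blocks[-1][1] == author:
            blocks[-1][2] += " " + text  # same author: extend previous message
        else:
            blocks.append([lines[0], author, text])
    # Re-emit in the original layout: header, text, blank line.
    return "".join(f"{header}\n{text}\n\n" for header, _, text in blocks)


sample = (
    "[08-Oct-20 02:40 PM] ShadowRanger5#3348\nhello\n\n"
    "[08-Oct-20 02:41 PM] ShadowRanger5#3348\nare you there\n\n"
    "[08-Oct-20 03:00 PM] sai#2795\nHi wassup\n\n"
)
merged = merge_consecutive(sample)
print(merged)
```

The merged log keeps the first header of each run, so the two messages from `ShadowRanger5#3348` become one statement ("hello are you there") followed by one reply. A message whose body itself contains a blank line would be split incorrectly by this sketch.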
Actually, the 733 comes from the WhatsApp data I am also using. That data has no problem: it was extracted successfully and its length is 732, so only 1 pair was extracted from the Discord chats. I am also confused about what the problem is. It works fine on the regex tester but stops working when implemented.
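One thing that could be checked when a pattern matches in a regex tester but not in a script is the raw text itself. This is only a guess and has not been confirmed as the cause in this thread: if the log reaches the script with Windows line endings (`\r\n`), or without a blank line after the last message, the `\n{2}` parts of the pattern will not match. The helpers `normalize_log` and `count_response_sets` and the sample string below are hypothetical.

```python
import re


def normalize_log(raw: str) -> str:
    """Convert CRLF/CR line endings to LF and end the log with a blank line."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    return text.rstrip("\n") + "\n\n"


def count_response_sets(text: str, person_name: str) -> int:
    """Count matches of the regex quoted earlier in this thread."""
    pattern = (
        r'\[.+\] (?!' + re.escape(person_name) + r').+\n(.+)\n{2}'
        r'(?:\[.+\] ' + re.escape(person_name) + r'\n(.+)\n{2})'
    )
    return len(re.findall(pattern, text))


# Assumed example: CRLF line endings and no trailing blank line.
raw = (
    "[08-Oct-20 02:40 PM] ShadowRanger5#3348\r\nhello\r\n\r\n"
    "[08-Oct-20 03:00 PM] sai#2795\r\nHi wassup"
)

before = count_response_sets(raw, "sai#2795")
after = count_response_sets(normalize_log(raw), "sai#2795")
print(before, after)  # 0 1
```

Comparing the two counts on the real export would show whether line endings are involved. If both counts are equal, the cause lies elsewhere.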
Hi, I am having some difficulty creating the dictionary of my friend's messages and mine. There seems to be a problem with the regex used in this code:
response_sets = re.findall(r'[.+] (?!' + re.escape(personName) + r').+\n(.+)\n{2}(?:[.+] ' + re.escape(personName) + r'\n(.+)\n{2})', data)
This is what has been used, but it returns a blank dictionary.
[08-Oct-20 02:40 PM] ShadowRanger5#3348
hello
[08-Oct-20 03:00 PM] sai#2795
Hi wassup
This is what my data looks like after formatting, but with the above regex I can't seem to create a dictionary of the conversations between my friend and me.
It is possible that this was written with older versions of the Discord chat parser in mind, like several other things in this repository that are a little outdated (the seq2seq model and some of the word2vec code).
I would appreciate it if anybody could come up with a solution for this.