Open questions #19

Open
Shuyib opened this issue Mar 17, 2020 · 3 comments
Labels
question Further information is requested

Comments

@Shuyib
Collaborator

Shuyib commented Mar 17, 2020

Hi,

I've just seen that the repo has been updated to reflect the changes so far. I have several questions about the way forward.

  • Are we going to make an application for this? What if someone wants to do this in future? I'm thinking ipywidgets could help.

  • Are we going to use natural language processing (NLP) to look at the main topics covered in the papers?

What do you think? @kipkurui

@Shuyib Shuyib added the question Further information is requested label Mar 17, 2020
@Shuyib Shuyib changed the title from "Looking into the content" to "Open questions" Mar 17, 2020
@Shuyib
Collaborator Author

Shuyib commented Mar 20, 2020

I'll do it anyway, when I get the time.

@kipkurui
Contributor

Hi @Shuyib, that is a wonderful idea. It would be fun to have a widget that provides real-time visualisation. Is that what you had in mind? For natural language processing, which kinds of questions should we prioritise?

  • Are we analysing the abstracts only? It can be tricky to get the full text, especially for closed-access journals.

@Shuyib
Collaborator Author

Shuyib commented Mar 20, 2020

Yes, we'll need to wrap the querying and the returning of results in a while loop, with a timed pause in between. I think it would be better to make one version where the user can just query what they want. I can wrap that with ipywidgets.
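To make the loop idea concrete, here is a rough sketch of what I have in mind, not the final code. The names `poll` and `fake_query` are hypothetical placeholders; the real query function would be whatever the repo already uses to fetch the papers. The `sleep` and `clock` parameters are only there so the timing can be swapped out when testing.

```python
import time


def poll(query, rounds, interval_s=60.0, sleep=time.sleep, clock=time.monotonic):
    """Call `query()` `rounds` times, pausing `interval_s` seconds between calls.

    Returns a list of (seconds_since_start, result) pairs.
    """
    results = []
    start = clock()
    done = 0
    while done < rounds:
        value = query()
        results.append((clock() - start, value))
        done += 1
        if done < rounds:
            # Wait before the next query; skipped after the final round.
            sleep(interval_s)
    return results


# Hypothetical stand-in for the real literature query (assumption, not repo code).
_counter = {"n": 0}


def fake_query():
    _counter["n"] += 1
    return {"papers_found": 10 * _counter["n"]}


# Demo run with a no-op sleep so it finishes immediately.
demo = poll(fake_query, rounds=3, interval_s=5, sleep=lambda seconds: None)
for elapsed, result in demo:
    print(f"{elapsed:.3f}s -> {result}")
```

For the "query what they want" version, the same `query` callable could take the search term from a text box, for example through `ipywidgets.interact`, but I haven't tried that yet.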

For NLP, I'll use non-negative matrix factorization (NMF), which is an unsupervised learning method. We can adjust K, that is, the number of clusters, to potentially find the topics, though unfortunately in the abstracts only. By the way, we could use this in the next session: mybinder.
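As a small illustration of the NMF idea, here is a dependency-free sketch using the standard multiplicative update rules on a word-count matrix. The four toy abstracts are made up for the example, and `nmf` and `top_terms` are hypothetical helper names. In practice I would expect to use a library implementation such as `sklearn.decomposition.NMF` with TF-IDF features rather than this hand-rolled version.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, k, n_iter=300, seed=0, eps=1e-9):
    """Approximate non-negative V (docs x terms) as W @ H with k topics."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


def top_terms(H, vocab, n=3):
    """Highest-weighted terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in H]


# Toy abstracts, invented purely for illustration.
abstracts = [
    "virus genome sequencing virus",
    "genome sequencing of virus samples",
    "hospital patients clinical outcomes",
    "clinical outcomes in hospital patients",
]
vocab = sorted({word for text in abstracts for word in text.split()})
V = [[float(text.split().count(word)) for word in vocab] for text in abstracts]

K = 2  # number of topics; this is the value we would tune
W, H = nmf(V, k=K)
doc_topic = [max(range(K), key=row.__getitem__) for row in W]
print(top_terms(H, vocab))
print(doc_topic)
```

Each row of `H` is a topic expressed as word weights, and each row of `W` says how strongly an abstract loads on each topic. Changing `K` changes how coarse or fine the topics are.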

How does that sound?


2 participants