JIICEU 2024 - NATAN DE SOUZA RODRIGUES, BARTOLOMEU SPEGIORIN GUSELLA, GUSTAVO HENRIQUE COSTA PINTO
Summary
TL;DR: This presentation discusses the development of solutions for author name disambiguation in digital bibliographic repositories. The project, led by Gustavo Henrique Costa Pinto and Professor Natan de Souza Rodrigues, focuses on creating a user-friendly interface for resolving cases where different authors share the same name in academic databases. It explores a multi-strategy approach that combines machine learning, natural language processing, and network analysis to improve disambiguation accuracy. Early results are promising; the next steps are to refine the system and validate its effectiveness through further testing and publication in academic journals.
Takeaways
- The project focuses on solving author name ambiguity in digital bibliographic repositories, such as dblp and arXiv.
- Author name ambiguity arises from homonymous authors, which causes confusion in academic databases.
- Gustavo Henrique Costa Pinto and Natan de Souza Rodrigues, both first-year students at the University of Goiás, are leading the research.
- The project uses a multi-strategy approach combining heuristic methods and AI techniques like machine learning and natural language processing.
- The first stage of the project successfully developed a user-friendly graphical interface to enhance existing author name disambiguation tools.
- The interface allows users to upload data files in XML or JSON format or retrieve author data from online repositories like dblp.
- The main objective of the project is to improve the flexibility and accuracy of author name disambiguation for more effective academic searches.
- The user interface was developed using Java Swing to stay compatible with the Java-based multi-strategy approach.
- The initial results are promising: the system runs author name disambiguation in a way that is more accessible to users.
- The next steps in the project involve refining the disambiguation techniques and validating the system through testing and comparison with existing solutions.
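As a rough illustration of the file-upload path described above, the following minimal Java sketch decides whether an uploaded file is XML or JSON from its extension. The class, enum, and method names are hypothetical and are not taken from the project; a Swing file chooser could call such a helper before handing the file to the disambiguation step.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: names here are illustrative, not the project's API.
final class InputSourceSketch {

    /** The two upload formats named in the summary. */
    enum FileFormat { XML, JSON }

    /** Guesses the upload format from the file extension (case-insensitive). */
    static Optional<FileFormat> detectFormat(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".xml")) {
            return Optional.of(FileFormat.XML);
        }
        if (lower.endsWith(".json")) {
            return Optional.of(FileFormat.JSON);
        }
        // Unsupported extension: a GUI would ask the user for another file.
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"authors.xml", "records.JSON", "notes.txt"}) {
            String label = detectFormat(name).map(FileFormat::name).orElse("unsupported");
            System.out.println(name + " -> " + label);
        }
    }
}
```

Checking only the extension is a simplification; a real tool would likely also validate the file contents before processing.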
Q & A
What is the main issue addressed in this project?
- The main issue addressed in this project is author name ambiguity in digital bibliographic repositories, which occurs when different authors share the same name, creating difficulties in accurately identifying them in academic searches.
Why is author name disambiguation important for digital bibliographic repositories?
- Author name disambiguation is crucial because it ensures accurate identification of authors in digital repositories, making it easier for users to find relevant academic work. Without disambiguation, searches may yield incorrect or incomplete results due to homonyms.
What are some examples of digital bibliographic repositories mentioned in the script?
- Examples of digital bibliographic repositories mentioned include DBLP, ArnetMiner, and Scopus.
What is the main objective of the project?
- The main objective of the project is to develop and enhance techniques for author name disambiguation in bibliographic repositories, aiming to improve the accuracy, security, and efficiency of academic searches.
What methodology was used in this project?
- The methodology involved conducting a literature review to understand existing approaches to author name disambiguation and developing a graphical user interface (GUI) to improve the usability of disambiguation tools. A multi-strategy approach combining machine learning, natural language processing, and heuristics was also used.
How does the graphical user interface (GUI) contribute to the disambiguation process?
- The GUI improves user experience by allowing users to easily choose the data source (XML, JSON, or online repositories), input the data, and initiate the disambiguation process. It makes the process more accessible and user-friendly.
What are the two main data input options provided by the GUI?
- The two main data input options are uploading a file in XML or JSON format or retrieving data directly from an online repository such as DBLP.
What is the 'multi-strategy approach' mentioned in the project?
- The multi-strategy approach refers to the integration of various techniques, including machine learning, natural language processing, and heuristics, to improve the accuracy and effectiveness of author name disambiguation.
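To make the idea of combining strategies concrete, here is a minimal Java sketch that scores whether two publication records belong to the same author by taking a weighted average of two simple heuristics. Everything in it is an assumption for illustration: the `Publication` class, the two strategies (co-author overlap and title-word overlap, both measured with Jaccard similarity), and the weights are hypothetical stand-ins, not the project's actual machine learning or natural language processing components.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: the strategies, weights, and names are assumptions.
final class MultiStrategySketch {

    /** Minimal publication record for one ambiguous author name. */
    static final class Publication {
        final String title;
        final Set<String> coauthors;

        Publication(String title, String... coauthors) {
            this.title = title;
            this.coauthors = new HashSet<>(Arrays.asList(coauthors));
        }
    }

    /** A strategy returns a same-author score in [0, 1] for two records. */
    interface Strategy {
        double score(Publication a, Publication b);
    }

    /** Jaccard similarity: |intersection| / |union|, or 0 when both sets are empty. */
    static double jaccard(Set<String> x, Set<String> y) {
        Set<String> union = new HashSet<>(x);
        union.addAll(y);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(x);
        intersection.retainAll(y);
        return (double) intersection.size() / union.size();
    }

    /** Lowercased word tokens of a title. */
    static Set<String> tokens(String text) {
        Set<String> result =
                new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+")));
        result.remove(""); // split can yield an empty leading token
        return result;
    }

    static final Strategy COAUTHOR_OVERLAP = (a, b) -> jaccard(a.coauthors, b.coauthors);
    static final Strategy TITLE_OVERLAP = (a, b) -> jaccard(tokens(a.title), tokens(b.title));

    /** Weighted average of the strategy scores; weights[i] belongs to strategies.get(i). */
    static double combinedScore(Publication a, Publication b,
                                List<Strategy> strategies, double[] weights) {
        double total = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < strategies.size(); i++) {
            total += weights[i] * strategies.get(i).score(a, b);
            weightSum += weights[i];
        }
        return weightSum == 0.0 ? 0.0 : total / weightSum;
    }

    public static void main(String[] args) {
        Publication p1 = new Publication("Graph mining for name disambiguation", "A. Silva", "B. Costa");
        Publication p2 = new Publication("Name disambiguation with graph mining", "A. Silva", "C. Lima");
        double score = combinedScore(p1, p2,
                Arrays.asList(COAUTHOR_OVERLAP, TITLE_OVERLAP), new double[] {0.5, 0.5});
        System.out.println("same-author score: " + score);
    }
}
```

Keeping each strategy behind a common interface means a learned model could later be plugged in alongside the heuristics without changing how the scores are combined.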
What is the significance of the development of the GUI in this project?
- The development of the GUI is significant because it makes the previously complex and less user-friendly disambiguation process more practical and accessible for users, thus enhancing the overall functionality of existing author name disambiguation systems.
What are the next steps in this project?
- The next steps include refining the author name disambiguation techniques, validating the system's performance, testing new approaches, and publishing the results in academic conferences and journals.