Tools
Pajek software for analyzing and visualizing large networks
Pajek (Slovenian for “spider”) is a program for analyzing and visualizing large networks, developed since 1996 by the laboratory’s scientific advisor, Vladimir Batagelj, and his colleague at the University of Ljubljana, Andrej Mrvar. The current version of Pajek is free for non-commercial use and is available at http://mrvar.fdv.uni-lj.si/pajek/.
Pajek is designed for large networks of several thousand to several million nodes: collaboration and citation networks, the Internet, diffusion networks (news, innovations, epidemics), organic molecules in chemistry, protein-receptor interaction networks, genealogies, and two-mode networks obtained in data mining.
In addition to ordinary (directed, undirected, and mixed) networks, Pajek supports multi-relational networks, acyclic networks, two-mode networks with two disjoint sets of vertices, and temporal networks. Beyond this basic functionality, Pajek can decompose a network into clusters and show the relationships between them, implements blockmodeling procedures and the analysis of acyclic, two-mode, and temporal networks, and provides powerful visualization tools together with efficient algorithms for analyzing large networks.
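Networks are supplied to Pajek as plain-text .net files. As a point of reference, a minimal undirected network (the vertex labels below are purely illustrative) looks like this; directed ties would be listed under *Arcs instead of *Edges:

```
*Vertices 3
1 "Author A"
2 "Author B"
3 "Author C"
*Edges
1 2
1 3
```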
The laboratory team uses Pajek in its applied research. Pajek is also taught in the master’s program “Data Analytics and Social Statistics”, as well as at dedicated schools and master classes.
- More about Pajek: http://mrvar.fdv.uni-lj.si/pajek/
- Laboratory workshops on Pajek: https://github.com/Daria-Maltseva/pajek/wiki/video
Streaming Data Processing Environment
The amount of information in the world is growing constantly, and existing tools and methods do not always make it possible to process it efficiently and on time, or to extract from a huge array of data exactly what a person or company needs at the moment. This project addresses the high time and labor costs of processing and analyzing information by introducing a streaming data processing approach that uses AI; the data themselves can be of very different kinds, for example news feeds, corporate information, or data from open sources on the Internet. The approach is based on parallel processes for handling incoming data, the scalability of these processes to the required number of tasks, and a flexible system for building each individual process.
As an example, consider the task of finding and analyzing news data about some event that has happened in the world or in a specific region. Solving it requires examining a large amount of content and media from different sources (including foreign-language ones), aggregating the collected information, verifying it, analyzing it, and much more. With manual or semi-manual processing, this work is either sequential and slow or requires more staff (in the case of a company). The stream processing approach avoids this: several operations (collection, translation, identification of dependencies, and so on) run almost simultaneously, and the output is not a scattered array of data but structured information (in this case, processed texts) matching the user’s request, which is much easier to work with and on which it is much easier to build the final analytics.
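The sketch below illustrates this idea in miniature, assuming three hypothetical stages (collection, translation, analysis) connected by queues; the stage names and their bodies are placeholders, not the project’s actual implementation:

```python
# A minimal sketch of staged stream processing: items flow through all
# stages almost simultaneously instead of being handled batch by batch.
import queue
import threading

SENTINEL = object()  # signals that a stage has no more items to emit

def collect(out_q):
    """Hypothetical source stage: pushes raw news items downstream."""
    for item in ["news item one", "news item two"]:  # stand-in for real feeds
        out_q.put(item)
    out_q.put(SENTINEL)

def translate(in_q, out_q):
    """Hypothetical stage: translates each item as soon as it arrives."""
    while (item := in_q.get()) is not SENTINEL:
        out_q.put(item.upper())  # placeholder for a real MT call
    out_q.put(SENTINEL)

def analyse(in_q, results):
    """Hypothetical sink stage: aggregates structured output."""
    while (item := in_q.get()) is not SENTINEL:
        results.append({"text": item, "entities": []})  # placeholder

raw_q, translated_q, results = queue.Queue(), queue.Queue(), []
stages = [
    threading.Thread(target=collect, args=(raw_q,)),
    threading.Thread(target=translate, args=(raw_q, translated_q)),
    threading.Thread(target=analyse, args=(translated_q, results)),
]
for t in stages:
    t.start()
for t in stages:
    t.join()
print(results)  # structured records rather than a scattered array of data
```

Because each stage only reads from and writes to queues, any slow stage can be replicated into a pool of identical workers, which is what makes the approach scalable to the required number of tasks.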
In summary, the proposed approach takes over all (or most) of the preparatory work for analyzing unstructured data, which saves time, labor, and money and, when the processes are properly structured, reduces the risk of losing important information.
“Bib-eLib” program for analyzing eLibrary data
The “Bib-eLib” program was developed by laboratory members together with a student of the master’s program “Data Analytics and Social Statistics” and is intended for collecting and processing Russian-language bibliographic data from the eLibrary electronic library. The program is written in Python.
The program downloads data on scientific publications through the eLibrary API, preprocesses them, resolves the disambiguation of publication authors, analyzes and visualizes the resulting data set, and builds a network of connections between scientific publications and their authors for further processing in Pajek.
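As a minimal sketch of the last step, assuming the publication records have already been downloaded and the authors disambiguated, the publication-author network can be assembled and saved in Pajek’s .net format with networkx (the record fields below are an illustrative assumption, not the real eLibrary schema):

```python
# Minimal sketch: build a two-mode publication-author network and export
# it for Pajek. The records are hypothetical preprocessed eLibrary data.
import networkx as nx

records = [
    {"id": "pub1", "authors": ["Ivanov I.I.", "Petrova A.A."]},
    {"id": "pub2", "authors": ["Ivanov I.I."]},
]

# Publications and authors form the two disjoint vertex sets; each edge
# links a publication to one of its authors.
G = nx.Graph()
for rec in records:
    for author in rec["authors"]:
        G.add_edge(rec["id"], author)

# Save in Pajek's .net format for further processing in Pajek.
nx.write_pajek(G, "elibrary_network.net")
```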
Bib-eLib was developed and tested during a study of collaboration practices among Russian sociologists and can be used in scientometric and bibliometric research based on eLibrary data.
Semantic Brand Score
...