Interactive syllabus
[Mikuláš Bankovič & Vítek Novotný] Geographical Information Retrieval in Mapy.cz & a Roadmap towards Machine Intelligence using Complex Systems 22. 4. 2021
[M. Bankovič] Geographical Information Retrieval in Mapy.cz
As the amount of information grows, Information Retrieval (IR) systems are being studied comprehensively. However, Geographical Information Retrieval (GIR) systems have received little attention. GIR aims to answer geospatial queries such as locating a place on the Earth's surface, planning routes or trips, or viewing real-time traffic. Queries for such searches were traditionally structured (e.g. coordinates); however, in recent years, the amount of unstructured text data has been growing rapidly. This is addressed by toponym disambiguation, query expansion, and the addition of text relevance. A GIR system consists of multiple parts, shown in Figure 1.1 [2], such as building a gazetteer, indexing geospatial documents, the user interface, query processing, and relevance ranking. In this lecture, I will primarily focus on using machine learning to rank candidates that were previously selected by a simpler search.
First, I will introduce GIR systems as a whole, with a focus on ranking using the query context (phrase, latitude, longitude), the document context (title, location), and the query-document relationship (distance, text similarity). Next, I will discuss the differences between production and theoretical settings and their limitations. I will present user queries that are interesting with respect to multiple aspects of GIR systems. I will introduce the production solution at Mapy.cz, along with possible alternatives and their advantages and disadvantages. This will be followed by a discussion of improvements, different strategies, and several smaller problems in geographical search. I will mention query auto-completion as a system distinct from search. Finally, I will discuss the state of GIR research and the goals we want to achieve. If time permits, I could also discuss personalization and the inclusion of user history.
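To make the three groups of ranking signals more concrete, the following minimal sketch computes a distance and a text-similarity feature for a query-document pair and sorts candidates by a simple score. It is an illustration only: the dictionary layout, the `rank` function, and the `distance_weight` value are assumptions made for this example, and the hand-tuned linear score merely stands in for a learned ranker. It does not describe the production system at Mapy.cz.

```python
import math
from difflib import SequenceMatcher


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def text_similarity(phrase, title):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, phrase.lower(), title.lower()).ratio()


def features(query, doc):
    """Query-document relationship features: distance and text similarity."""
    return {
        "distance_km": haversine_km(query["lat"], query["lon"], doc["lat"], doc["lon"]),
        "text_similarity": text_similarity(query["phrase"], doc["title"]),
    }


def rank(query, candidates, distance_weight=0.01):
    """Sort candidates by a hypothetical linear score (higher is better)."""
    def score(doc):
        f = features(query, doc)
        return f["text_similarity"] - distance_weight * f["distance_km"]
    return sorted(candidates, key=score, reverse=True)
```

With two places of the same name, this score prefers the one closer to the query position, which is the kind of toponym disambiguation a ranking model is expected to learn from data rather than from a fixed weight.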
Readings
- HUANG, Jizhou, et al. Personalized prefix embedding for POI auto-completion in the search engine of Baidu Maps. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020. p. 2677-2685. DOI: 10.1145/3394486.3403318
- PURVES, Ross S., et al. Geographic information retrieval: Progress and challenges in spatial search of text. Foundations and Trends in Information Retrieval, 2018, 12.2-3: 164-318. Available at WWW: https://ir.shef.ac.uk/cloughie/papers/purves_et_al_working_paper.pdf
[V. Novotný] A Roadmap towards Machine Intelligence Using Complex Systems
The development of intelligent machines is one of the biggest unsolved challenges in computer science. In this paper, we propose some fundamental properties these machines should have, focusing in particular on learning. We discuss a simple environment that could be used to incrementally teach a machine the basics of natural-language-based communication, as a prerequisite to more complex interaction with human users. We also present some conjectures on the sort of algorithms the machine should support in order to profitably learn from the environment.
An explanatory model for the emergence of evolvable units must display emerging structures that (1) preserve themselves in time, (2) self-reproduce, and (3) tolerate a certain amount of variation when reproducing. To tackle this challenge, here we introduce Combinatory Chemistry, an Algorithmic Artificial Chemistry based on a minimalistic computational paradigm named Combinatory Logic. The dynamics of this system comprise very few rules; it is initialized with an elementary tabula rasa state and features conservation laws replicating natural resource constraints. Our experiments show that a single run of this dynamical system with no external intervention discovers a wide range of emergent patterns. All these structures rely on acquiring basic constituents from the environment and decomposing them in a process that is remarkably similar to biological metabolisms. These patterns include autopoietic structures that maintain their organisation, recursive ones that grow in linear chains or binary-branching trees, and most notably, patterns able to reproduce themselves, duplicating their number at each generation.
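For readers unfamiliar with Combinatory Logic, the sketch below reduces terms built from the standard S, K, and I combinators using the textbook rules I x → x, K x y → x, and S x y z → x z (y z). The nested-tuple term representation and the function names are assumptions made for this illustration. It shows plain term rewriting only and does not model the resource conservation or reaction dynamics of Combinatory Chemistry.

```python
def step(term):
    """Apply one leftmost-outermost reduction; return (new_term, changed).

    Atoms are strings; an application of f to x is the tuple (f, x).
    """
    if isinstance(term, str):
        return term, False
    f, x = term
    # I x -> x
    if f == "I":
        return x, True
    # K a b -> a, where term == (("K", a), b)
    if isinstance(f, tuple) and f[0] == "K":
        return f[1], True
    # S a b c -> (a c) (b c), where term == ((("S", a), b), c)
    if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == "S":
        a, b, c = f[0][1], f[1], x
        return ((a, c), (b, c)), True
    # No rule applies at the root: try the function part, then the argument.
    new_f, changed = step(f)
    if changed:
        return (new_f, x), True
    new_x, changed = step(x)
    return (f, new_x), changed


def normalize(term, max_steps=100):
    """Reduce until no rule applies or max_steps is reached."""
    for _ in range(max_steps):
        term, changed = step(term)
        if not changed:
            break
    return term
```

For example, `normalize(((("S", "K"), "K"), "x"))` reduces S K K x to `"x"`, showing that S K K behaves like the identity combinator. The `max_steps` bound is needed because some terms never reach a normal form.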
Emergent processes in complex systems such as cellular automata can perform computations of increasing complexity, and could possibly lead to artificial evolution. Such a feat would require scaling up current simulation sizes to allow for enough computational capacity. Understanding complex computations happening in cellular automata and other systems capable of emergence poses many challenges, especially in large-scale systems. We propose methods for coarse-graining cellular automata based on frequency analysis of cell states, clustering and autoencoders. These innovative techniques facilitate the discovery of large-scale structure formation and complexity analysis in those systems. They emphasize interesting behaviors in elementary cellular automata while filtering out background patterns. Moreover, our methods reduce large 2D automata to smaller sizes and enable identifying systems that behave interestingly at multiple scales.
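As a rough illustration of frequency-based coarse-graining, the sketch below simulates an elementary cellular automaton on a cyclic grid, splits each row of the space-time diagram into non-overlapping blocks, and marks a block with 1 when its pattern is rare across the whole diagram. The block size, the `rare_below` threshold, and all function names are assumptions chosen for this example. It is a simplified stand-in and not the method of the paper, which also uses clustering and autoencoders.

```python
from collections import Counter


def eca_step(cells, rule):
    """One update of an elementary cellular automaton with cyclic boundaries."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(cells, rule, steps):
    """Return the space-time diagram as a list of rows (initial row included)."""
    rows = [list(cells)]
    for _ in range(steps):
        rows.append(eca_step(rows[-1], rule))
    return rows


def coarse_grain(rows, block=3, rare_below=0.05):
    """Map each non-overlapping block to 1 if its pattern is rare, else 0."""
    blocks = [
        [tuple(row[i:i + block]) for i in range(0, len(row) - block + 1, block)]
        for row in rows
    ]
    counts = Counter(p for row in blocks for p in row)
    total = sum(counts.values())
    return [[1 if counts[p] / total < rare_below else 0 for p in row] for row in blocks]
```

Each coarse row is shorter than the original by a factor of `block`, and frequent background patterns map to 0, so only unusual local structure remains visible.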
In order to develop systems capable of modeling artificial life, we need to identify which systems can produce complex behavior. We present a novel classification method applicable to any class of deterministic discrete space and time dynamical systems. The method distinguishes between different asymptotic behaviors of a system's average computation time before entering a loop. When applied to elementary cellular automata, we obtain classification results that correlate very well with Wolfram's manual classification. Further, we use it to classify 2D cellular automata to show that our technique can easily be applied to more complex models of computation. We believe this classification method can help to develop systems in which complex structures emerge.
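The quantity at the centre of this classification, the number of steps a deterministic finite system takes before entering a loop, can be measured directly. The sketch below does so for elementary cellular automata on a cyclic grid and averages the result over random initial configurations. The function names, the sampling scheme, and the default seed are assumptions made for this illustration, and the sketch omits the paper's analysis of how the average grows with grid size.

```python
import random


def eca_step(cells, rule):
    """One update of an elementary cellular automaton with cyclic boundaries."""
    n = len(cells)
    return tuple(
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    )


def transient_and_cycle(initial, rule):
    """Return (transient length, cycle length) of the trajectory from `initial`."""
    seen = {}  # configuration -> time step of its first visit
    state = tuple(initial)
    t = 0
    while state not in seen:
        seen[state] = t
        state = eca_step(state, rule)
        t += 1
    # `state` is the first repeated configuration; it was first visited at seen[state].
    return seen[state], t - seen[state]


def average_transient(rule, size, samples=100, seed=0):
    """Average transient length over random initial configurations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        initial = [rng.randint(0, 1) for _ in range(size)]
        total += transient_and_cycle(initial, rule)[0]
    return total / samples
```

Running `average_transient` for increasing values of `size` gives the kind of growth curve whose asymptotic behavior the classification distinguishes. Exhaustive state tracking limits this sketch to small grids.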
Readings
- MIKOLOV, Tomas; JOULIN, Armand; BARONI, Marco. A roadmap towards machine intelligence. In: International Conference on Intelligent Text Processing and Computational Linguistics. Springer, Cham, 2016. p. 29-61. DOI: 10.1007/978-3-319-75477-2_2
- KRUSZEWSKI, Germán; MIKOLOV, Tomas. Combinatory Chemistry: Towards a Simple Model of Emergent Evolution. In: Artificial Life Conference Proceedings. MIT Press, 2020. p. 411-419. DOI: 10.1162/isal_a_00258
- CISNEROS, Hugo; SIVIC, Josef; MIKOLOV, Tomas. Visualizing computation in large-scale cellular automata. In: Artificial Life Conference Proceedings. MIT Press, 2020. p. 239-247. DOI: 10.1162/isal_a_00277
- HUDCOVÁ, Barbora; MIKOLOV, Tomáš. Classification of Complex Systems Based on Transients. In: Artificial Life Conference Proceedings. MIT Press, 2020. p. 367-375. DOI: 10.1162/isal_a_00260