DIALECT, an expert system for document retrieval
Paris 11
Disciplines:
Abstract EN:
The aim of this project, built around the experimental system DIALECT, is to improve retrieval effectiveness and to suggest more flexible ways of using information retrieval systems. The research integrates artificial intelligence (AI) and information retrieval (IR) techniques, since the quality of service usually provided by automatic information retrieval systems has been found inadequate. The particular problem that the “expert system” addresses in the retrieval setting is automatic request reformulation: replacing the initial entries with a variety of “equivalent” formulations. The request is automatically expanded and transformed in order to retrieve additional documents. The system uses rules and meta-rules, with linguistic models acting as meta-rules or as a further expert level. Starting from a request in natural language, the system first retrieves some of the most relevant bibliographic records, which must contain natural language text. The rules of a linguistic model then construct reformulation rules from the sentences of these selected texts. The first retrieval steps thus serve to select the candidate reformulation rules. Query reformulation and search are closely coupled: together they perform rule selection and rule firing in the “expert system”. The system goes through a cycle of query (re)formulation and retrieval of candidate rules, which are explicitly “built” only at that time. The basic information in the “knowledge base”, which supplies the inference rules capable of generating new facts from existing ones, is therefore the document collection itself. At each cycle a set of relevant documents is also proposed. This dynamic process is repeated until a stopping condition is met.
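The cycle described above can be illustrated with a minimal Python sketch. It is not DIALECT's implementation: the stop list, the term-overlap retrieval, the representation of a reformulation rule as a (trigger, added term) pair, and the names `tokenize`, `retrieve`, `build_rules` and `reformulate` are all assumptions made for illustration, and simple word co-occurrence stands in for the linguistic model.

```python
import re

# Hypothetical stop list; the real system relies on linguistic analysis instead.
STOPWORDS = {"a", "for", "from", "of", "the", "with"}


def tokenize(text):
    """Return the set of lowercase ASCII words in text, minus stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def retrieve(query, collection, min_overlap=2):
    """Return ids of documents sharing at least min_overlap terms with the query."""
    return {
        doc_id
        for doc_id, text in collection.items()
        if len(query & tokenize(text)) >= min_overlap
    }


def build_rules(query, hits, collection):
    """Build (trigger, added_term) rules from the retrieved texts only.

    Assumed stand-in for the linguistic model: a query term found in a
    retrieved text suggests every other term of that text.
    """
    rules = set()
    for doc_id in hits:
        terms = tokenize(collection[doc_id])
        for trigger in terms & query:
            for added in terms - query:
                rules.add((trigger, added))
    return rules


def reformulate(request, collection, max_rounds=10):
    """Repeat retrieval and rule firing until nothing new is found."""
    query = tokenize(request)
    retrieved = set()
    for _ in range(max_rounds):
        hits = retrieve(query, collection)
        rules = build_rules(query, hits, collection)
        new_terms = {added for _, added in rules}  # fire every candidate rule
        # Stopping condition: no new document and no new term.
        if hits <= retrieved and not new_terms:
            break
        retrieved |= hits
        query |= new_terms
    return query, retrieved


# Toy collection, invented for this example.
collection = {
    1: "expert system for document retrieval",
    2: "document retrieval with query reformulation",
    3: "query reformulation rules from linguistic models",
    4: "cooking recipes for winter soups",
}
query, retrieved = reformulate("expert system for document retrieval", collection)
print(sorted(retrieved))  # [1, 2, 3]
```

In this toy run, document 3 shares no term with the initial request and is reached only after the rules built from documents 1 and 2 have added "query" and "reformulation" to the request. The rules exist only once the first documents have been retrieved, which mirrors the idea that the collection itself acts as the knowledge base.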
Incorporating linguistic procedures and a knowledge-based “query reformulation” expert system into a retrieval setting increases the effectiveness of the system. It also brings added benefits in the form of new and more sophisticated services. The extensions to the “standard” retrieval service are the following. (1) A natural language front end lets users state their initial information request in French. (2) Friendly interfaces are available, giving users a large degree of flexibility in choosing how to interact with the system. The system may operate in several modes. (i) The so-called “casual user mode” offers a fully transparent process that decides on its own whenever there is an opportunity to improve the request. The user simply submits a query in French and lets the system search and display the relevant available information as a list of retrieved documents and/or portions of documents. (ii) The so-called “expert assistant-documentalist mode” lets trained users step into the process more often in order to control and improve its results. Such improvements may consist of redefining elements of the “semantic classes”, adding or removing propositions, or using underlying models of the internal representations, such as informative indexes and/or bibliographic information. For such users the system leads the dialogue: another function of the query reformulation component is to help the user produce complex Boolean descriptions of the required documents. The system is useful for consultation by expert users, but it can also train inexperienced users. (iii) The so-called “specialist mode” provides traces that allow designers (linguists, analysts, etc.) to oversee the system's operation and to intervene, by means of a specialized language, in order to “modify the rules”.
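As an illustration of the Boolean descriptions mentioned for the expert assistant-documentalist mode, the following Python sketch assumes that each “semantic class” is a named set of interchangeable terms, that terms within a class are combined with OR, and that classes are combined with AND. This representation and the names `boolean_query` and `matches` are hypothetical; the abstract does not specify how DIALECT encodes its classes.

```python
def boolean_query(semantic_classes):
    """Render the classes as a Boolean expression: OR within a class, AND between classes."""
    groups = []
    for name in sorted(semantic_classes):
        terms = sorted(semantic_classes[name])
        if terms:  # an empty class contributes no constraint
            groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


def matches(semantic_classes, doc_terms):
    """True if the document contains at least one term of every non-empty class."""
    return all(doc_terms & terms for terms in semantic_classes.values() if terms)


# Hypothetical semantic classes for one request.
classes = {
    "system": {"expert", "knowledge"},
    "task": {"retrieval", "search"},
}
print(boolean_query(classes))
# (expert OR knowledge) AND (retrieval OR search)

# A trained user redefines one class, as the expert mode permits.
classes["task"].discard("search")
classes["task"].add("indexing")
print(boolean_query(classes))
# (expert OR knowledge) AND (indexing OR retrieval)
```

Keeping the classes as plain sets makes the edits available to a trained user (adding or removing a term, redefining a class) single set operations, after which the Boolean description is regenerated.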
Abstract FR (translated from French):
DIALECT is an “expert system” that assists document retrieval. The system lets users formulate their request in French. No prior knowledge of the database is required. The system drives the search strategy and guides the user toward an effective formulation of the request. The user's question, written in natural language, is analyzed and then used to extract an initial core of highly relevant records. The “text fields” of these records are in turn analyzed and exploited to enrich the question. The system has linguistic components that allow it to extract “reformulation rules” suited to this type of application. Firing the rules and transforming the request allows new bibliographic records to be extracted. This general, autonomous strategy is repeated until a stopping condition is reached. It is controlled and adapted by heuristic rules according to the user's profile, the progress of the search, and so on. The process conducts a different dialogue with each class of user. Three classes are distinguished: the occasional user, the professional user or documentalist, and the experts who work on the system's rules. The system lets the professional user intervene at specific points in its operation to control and steer the inferences. Another distinctive feature of this document retrieval system is that it is implemented on a Database Management System [Fiche ANL 1985]