Open Password – Monday January 31, 2022
#1023
Conference AI-SDV 2021 – Search – Data Analysis – Visualization – Knowledge Processing – Bassam Mokbel – Lucy Antunes – CAS IP Services – Muriel Bourgeois Tassanary – MT-IP Consulting – Intellectual Property Manager – Skill Set – Heiko Wongel – Wongel IP – Interface Projects – Machine Learning Tools in Patent Searching – Rule-based search logic – ML-assisted similarity search – Intergator Smart Search – Human-in-the-loop approach – Transfer learning – EXTRA Classifier – Document classification – Information extraction – Holger Keibel – Karakun – Daniele Puccinelli – BERT – Training data – Small Data – Nils Newman – Search Technology – Composite AI – Generative AI approaches – Linus Wretblad – IPscreener/Uppdragshuset – Black-box character of decision-making – Best practice – Explainable AI – External SaaS offers – Assisted Reading – Automated decision-making processes – User perspective – Spatial Concept Maps – Patent Citation Network Maps – Tony Trippe – Patinformatics – Software Implementation – Marjorie Hlava – Information Access – Content Management Technologies – Public Library of Science – American Society of Clinical Oncology – Semantic search functions – Automatic tagging
Experian – Intrum Switzerland – DACH Business – BIIA – Credit Rating Data – Marco Kaiser – Science Council – Transformation – Scientific Publishing – Open Access – UB University of Hildesheim – Annette Strauch-Davey – Research Ethics – GO UNITE
I.
Experience report AI-SDV 2021:
On the fronts of search, data analysis, visualization and knowledge processing (II) – By Dr. Bassam Mokbel
II.
Experian:
Cooperation with Intrum Switzerland to Grow its DACH Business
III.
Science Council: Recommendations for the OA transformation process
UB University of Hildesheim: Exchange on research ethics
Experience report AI-SDV 2021
On the fronts of search, data analysis, visualization and knowledge processing (II)
By Dr. Bassam Mokbel*
_____________________________________________________
Combination of rule-based search logic and ML-assisted similarity search.
_____________________________________________________
The second day of the conference began with a presentation by Lucy Antunes (CAS IP Services) and Muriel Bourgeois Tassanary (MT-IP Consulting) entitled «Project Management Challenges for IP Projects». The speakers discussed the various challenges facing intellectual property (IP) managers, for example understanding the priorities of their customers and superiors, collecting key information for the respective type of IP, and assessing their own skill set against the regulatory framework and, if necessary, expanding it. How these demands are coordinated varies from project to project. However, software tools and digital knowledge sources, as well as targeted outsourcing, can facilitate or accelerate many processes.
Heiko Wongel (Wongel IP), in collaboration with Interface Projects, asked the question: «Machine learning tools in patent searching – are we on the right track?». The speaker saw the central challenge in the fact that the growing volume of patent documentation will soon no longer be manageable with conventional search methods. ML-based tools are already being incorporated into the search process in various ways to support human users. However, AI and ML have not yet achieved a real breakthrough as a standard in patent search. As possible reasons, he cited weaknesses of ML known from research, which deny the human user insight into the search logic and make it more difficult to influence the process directly.
Wongel compared the respective advantages and disadvantages of rule-based search logic with ML-assisted similarity search and then described the search with the product «Intergator Smart Search», which cleverly combines the two approaches. In addition, the user can interactively influence the search logic using a graph visualization.
Wongel’s presentation impressed me personally, as human-in-the-loop is a promising way to increase trust in AI. This potential was illustrated with the tool shown.
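To make the idea of combining the two approaches more concrete, here is a minimal sketch. It is not the logic of the product presented; it only assumes that a transparent Boolean rule first narrows the document set and that a similarity score then ranks what remains. The function names, the toy documents and the bag-of-words cosine similarity (used here as a simple stand-in for an ML-based similarity model) are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(docs, required_terms, example_text):
    """Filter by a rule (all required terms present), then rank by similarity."""
    query_vec = Counter(tokenize(example_text))
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        # Rule-based step: a transparent Boolean AND over the required terms.
        if not all(term in tokens for term in required_terms):
            continue
        # Similarity step: hypothetical stand-in for an ML similarity score.
        hits.append((doc_id, cosine(query_vec, Counter(tokens))))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy documents, invented for illustration only.
docs = {
    "P1": "battery electrode with lithium coating for electric vehicles",
    "P2": "lithium battery charging circuit with thermal protection",
    "P3": "solar panel mounting bracket",
}
results = hybrid_search(docs, ["lithium", "battery"], "lithium battery electrode coating")
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In this sketch the rule remains fully visible to the user and can be edited, while the similarity score only orders the documents the rule lets through. That is one simple way to keep a human in the loop.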
_____________________________________________________________________
Transfer learning application “EXTRA Classifier” expanded to include document classification and information extraction.
_____________________________________________________
In «Leveraging pre-trained language models for document classification», Holger Keibel (Karakun) and Daniele Puccinelli (University of Applied Sciences of Southern Switzerland) presented the «EXTRA Classifier», a transfer learning application that extends the well-known pre-trained language model BERT to several document classification tasks and to information extraction. The focus was on practical application to scanned documents, so that text recognition from image data (OCR) also became part of the processing chain. Promising results were achieved in distinguishing bank documents, invoices and contracts, as well as in rejecting other types of documents, although only a small amount of training data had been annotated manually. Furthermore, relevant text components of the respective document types, such as the invoice number on invoices, were extracted reliably and automatically.
_____________________________________________________
How we can help ourselves with small data in the face of insufficient training data.
_____________________________________________________
Nils Newman from Search Technology sought to answer the question: “AI – Who is in control and why is that important?”. He first addressed the problem that the increasingly complex and extensive pre-trained language models can be unsuitable for specialist knowledge and deliver inadequate quality in tools intended for expert users. Newman emphasized the importance of sufficient training data, a factor that is often underestimated.
The work of subject matter experts is often based on small but highly relevant amounts of data, i.e. on small data instead of big data. These may be too small for training reliable ML systems. As possible solutions, Newman mentioned zero-shot, few-shot and transfer learning, but also the combination of several AI methods with language analysis, knowledge modeling and other techniques. Combinations of this type are commonly referred to as “Composite AI”. Generative AI approaches were also discussed. The speaker concluded by pointing out how important it is to give subject matter experts sufficient control over AI instead of trying to replace their expertise with automation.
_____________________________________________________
Explaining AI to users reduces their skepticism.
_____________________________________________________
In «Best practice on new intelligent tools in IP management and the ethical dilemma of using AI», Linus Wretblad (IPscreener/Uppdragshuset) addressed the reliability, transparency and data protection aspects of ML-based systems. The technology encounters limitations in these areas, for example due to the inherent black-box nature of its decision-making. This often leads to skepticism among users. To counteract this, some best practices have now been developed, for example for evaluating ML models. However, these are not widely known or not yet in use. Furthermore, under the keyword “Explainable AI”, recent research offers extensions to the technology that can improve transparency. These extensions still have to find their way into practice.
Wretblad took the perspective of a user who, for example, wants to train and apply ML models on their own data using external SaaS offerings. He advised users to ask themselves questions about the transparency and data protection specifics of the service before and during their work, and also to scrutinize the evaluation processes and changes in quality. Finally, he presented strategically promising areas of AI application in IP management, for example counteracting information overload with “assisted reading” or automated decision-making processes.
I found Wretblad’s explicit adoption of a user perspective to be very successful.
Christoph Haxel, the creator of AI-SDV
_____________________________________________________
Processing of retrieved document sets in “Spatial Concept Maps” and “Patent Citation Network Maps”.
_____________________________________________________
In «The Current State of Machine Learning for Patent Searching and Analytics: Practical Perspectives from ML4Patents.com», Tony Trippe from Patinformatics gave an overview of current developments in the field of AI-powered patent search and analysis. He listed the work steps required to create patent landscape reports and then discussed how AI approaches can support them. In addition to improving search and relevance assessment through semantic text analysis, the focus was on the subsequent processing of the retrieved document set. Automatic grouping and categorization can be helpful here. Such methods should not only derive the similarity of patent specifications from the textual and categorical descriptions in the document, but also include, for example, assessments by external experts and references to other patents.
Trippe presented two types of visualization that depict similarities in two-dimensional maps: «Spatial Concept Maps», which resemble a geographic map, and «Patent Citation Network Maps», which show citation links between patents as a network. A graph or network representation not only aids visualization, but also supports downstream analysis, such as the discovery of influential patents as prominent network nodes. For some figures and examples, the speaker referred to his own website ML4Patents.com, a general collection of resources on the topic, which contains numerous blog posts, news articles, publications and educational materials.
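As a rough illustration of how a citation network can support the discovery of influential patents, the following sketch counts how often each patent is cited, i.e. its in-degree in the network. This is only one simple measure chosen for illustration, not necessarily the method used in the maps shown; the patent identifiers and citation pairs are invented.

```python
def citation_counts(citations):
    """Count how often each patent is cited (its in-degree in the network)."""
    counts = {}
    for citing, cited in citations:
        counts[cited] = counts.get(cited, 0) + 1
        # Make sure patents that are never cited still appear with count 0.
        counts.setdefault(citing, 0)
    return counts


def most_cited(citations, top_n=3):
    """Return the top_n patents by citation count, ties broken by identifier."""
    counts = citation_counts(citations)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical citation pairs in the form (citing patent, cited patent).
citations = [
    ("B", "A"), ("C", "A"), ("D", "A"),
    ("D", "B"), ("E", "B"),
    ("E", "C"),
]
top = most_cited(citations, top_n=2)
print(top)
```

In practice, more refined measures such as PageRank-style scores could replace the plain count, but the basic step is the same: the network structure itself becomes data for analysis.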
The conference concluded with the lecture “Semantic Search and Content Management – Case Studies in Successful Software Implementations” by Marjorie Hlava (Information Access). She presented a range of very different case studies in which outdated content management technologies and infrastructures were successfully replaced or modernized in practice. The cases included major institutions such as the Public Library of Science (PLOS) and the American Society of Clinical Oncology (ASCO). In each case, the speaker described the state of the respective software before the conversion, the technical challenges to be tackled and the goals of the modernization, which were determined in close cooperation with the respective customers. Although the solutions were adapted to each customer, there were commonalities in the approach, for example the introduction of semantic search functions (supported by taxonomies, among other things) and of automatic keyword tagging across the document inventory. In all cases, the conversion resulted in significant added value for customers.
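What taxonomy-supported automatic tagging can look like in its simplest form is sketched below. The sketch assumes a small hand-made taxonomy that maps each preferred term to a few synonyms and tags a document with the preferred term whenever one of these phrases occurs. Both the taxonomy and the matching logic are illustrative assumptions and not the systems described in the case studies.

```python
import re


def tag_document(text, taxonomy):
    """Return the sorted preferred terms whose phrases occur in the text."""
    # Normalize to lowercase words so that phrases match on word boundaries.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    padded = f" {normalized} "
    tags = []
    for preferred, synonyms in taxonomy.items():
        for phrase in [preferred, *synonyms]:
            if f" {phrase.lower()} " in padded:
                tags.append(preferred)
                break  # One matching phrase is enough for this term.
    return sorted(tags)


# Hypothetical mini-taxonomy: preferred term -> synonyms.
taxonomy = {
    "oncology": ["cancer", "tumor"],
    "open access": ["oa publishing"],
    "cardiology": ["heart disease"],
}
tags = tag_document("New cancer study published under open access.", taxonomy)
print(tags)
```

Because the tags are preferred terms rather than the words that happen to appear in the text, a search for "oncology" also finds documents that only mention "cancer". This is the kind of benefit a taxonomy brings to semantic search.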
_____________________________________________________
My conclusion.
_____________________________________________________
Overall, the conference was an enriching experience for me. The lectures looked at many different aspects of the overarching topic and complemented each other very well in terms of content. The solid quality of all presentations also left an extremely positive overall impression.
*Dr. Bassam Mokbel is Chief Data Scientist at Semalytix, a provider of AI-based aggregation of patient statements from social media forums and other text sources. Previously, he conducted research at the “Center for Cognitive Interaction Technology” (CITEC) at Bielefeld University in the areas of machine learning and data visualization methods.
Experian
Cooperation with Intrum Switzerland
to Grow its DACH Business
(BIIA) Experian is expanding its strategic partnership with Intrum Switzerland and thus extending its commitment in the DACH region. Intrum clients in Switzerland now benefit from business information from Germany and Austria, enabling them to tap into additional target groups within the DACH region.
Intrum is broadening its existing offering of credit rating data with German and Austrian consumer data. Companies from Switzerland can now easily obtain meaningful information about their foreign customers through an already existing interface.
«In the DACH region we see great opportunities to become a decisive player with our core business in the long term,» comments Marco Kaiser, Vice President Business Development at Experian DACH. “On the one hand, companies that already have relationships with customers in Germany or Austria will benefit from the new offering. On the other hand, it also opens up attractive opportunities for companies that are planning to expand into the neighboring German-speaking countries but have not yet dared to do so due to possible payment defaults.”
The first e-commerce companies from Switzerland are already accessing the high-quality data via Intrum’s Credit Information data pool. By using reliable and fast processes, they can quickly expand their customer base to the DACH region without increasing their payment default rate. Via the customer-friendly web portal or via a modern programming interface, customers receive all information directly from a single source and can thus focus on their growth strategy.
As a leading provider of credit reports, Intrum offers address and creditworthiness data on virtually all people in Switzerland. Credit scores are updated daily thanks to Intrum’s own collection data and are continuously expanded through additional external data sources. Intrum provides its clients with the best possible risk management and offers solutions for compliance checks and fraud prevention services.
BIIA is the international partner of Open Password
Just published
Science Council: Recommendations
on the OA transformation process
Science Council, Recommendations for the transformation of scientific publishing to Open Access, January 2022, https://www.wissensrat.de/download/2022/9477-22.pdf?__blob=publicationFile&v=12 – The contents:
Short version
- Publishing as part of the research process
A.I Places of publication and forms of publication
A.II Functions of scientific publishing
A.III Developments in scientific publication service providers
A.IV Development of the open access movement
A.V Financing models of open access publications
A.VI Data on publication numbers and publication costs
A.VII Systematization of open access – Dimensions of openness
A.VIII Licensing
- Aim and subject of the recommendations
- Recommendations
C.I Products and processes
I.1 Further development of scientific publications in their diversity
I.2 Further development of publications as digital objects
I.3 Expectations of publication services
I.4 Ensuring the quality of the content of articles
I.5 Incentives that promote quality
C.II Framework conditions
II.1 Tasks and interaction between actors in the science system
II.2 Financial flows and business models
II.3 Infrastructure for scientific publishing
Appendix: State of OA transformation
I.1 The open access discourse since the Berlin Declaration
I.2 Legal framework and practice
I.3 Tested contract models for OA publication organs
UB University of Hildesheim
Exchange on research ethics
(Annette Strauch-Davey) At the last “GO UNITE!” autumn workshop (https://www.go-fair.org/events/go-unite-autumn-workshop/), several topics were presented for new working groups, including the topic of “research ethics”. The thematic group would like to get together in advance of the next GO UNITE! exchange at the general meeting in February and then report briefly on it at the general meeting in the spring.
The meeting will take place on Thursday, February 10, 2022 between 1:30 p.m. and 2:30 p.m. and I would like to remind you of this. We meet in the “FAIRes FDM” meeting room: https://bbb.uni-hildesheim.de/b/ann-34u-ft7
Everyone who wants to work on this important aspect of FDM (research data management), in different contexts, for research support at their locations is invited.
Link: https://www.go-fair.org/events/go-unite-working-groups-join-the-discussion-in-german/
OpenPassword
Forum and news
for the information industry
in German-speaking countries
New editions of Open Password appear three times a week.
If you would like to subscribe to the email service free of charge, please register at www.password-online.de.
The current edition of Open Password can be accessed on the web immediately after it appears, at www.password-online.de/archiv. This also applies to all previously published editions.
International Cooperation Partner:
Outsell (London)
Business Industry Information Association/BIIA (Hong Kong)