Open Password – Friday June 25, 2021
#939
Marianne Englert – FAZ – Digitization – Specialist Group 7 – Marianne Englert Prize – vfm – ISI 2021 – Information Retrieval – Information Competence – Google – Willi Bredemeier – Proceedings – Virtual Events – Dirk Lewandowski – Search Engine Competence – Sebastian Schultheiss – Advertisements – Search Engine Optimization – Search Engine Advertising – Ranking – Libraries – Collaborative information search – Carolin Schulz – Stefanie Elbeshausen – Christa Womser-Hacker – Saleema Amershi – Meredith Morris – Driver – Observer – Relevance – Sebastian Sünkler – Algorithm – Search engine research – Reputation of sources – Quality of texts – DGI – Virtual DGI-Stammtisch – information transfer in times of distance – Corona – knowledge organization 2021 – indexing – knowledge graphs – DGI online seminars – Christoph Haxel – Artificial Intelligence Conference – Search – Data and Text Mining – Analytics – Visualization
I. Marianne Englert 1926 – 2021
II. Cover story:
Information retrieval. Information literacy must primarily consist of competent use of Google
III. DGI
Providing information in times of distance
IV. Nice
The Artificial Intelligence Conference on Search, Data and Text Mining, Analytics and Visualization
Marianne Englert 1926 – 2021
Marianne Englert has died. The first archive director of the Frankfurter Allgemeine Zeitung, founded in 1949, accompanied and supported the editorial team in their daily work for 42 years (1950 – 1992). She built a powerful archive and set new standards for media archives. She recognized the beginning of digitalization early on as a new opportunity and challenge for the professional sector.
From 1965 to 1989, Ms. Englert was chairwoman of the specialist group of media archivists in the Association of German Archivists (specialist group 7). She later became honorary chairwoman of the vfm (Association for Advanced Training for Media Archivists/Documentarians) and of fg7, co-author of the career profile of media archivists/documentarians and head of the “Continuing Education Working Group”. The vfm named its prize after her; every year since 2012 it has used the prize to honor students and graduates from the areas of information, documentation, archives and libraries.
A more detailed appreciation of Marianne Englert is in preparation.
ISI 2021:
Information Retrieval
Information literacy must primarily
consist of competent use of Google
By Willi Bredemeier
I was able to watch only part of the “Information Retrieval” session at ISI 2021 on screen because I was called away for personal reasons. I tried to fill the gaps in my knowledge by reading the proceedings. As I read, a question occurred to me: why is a virtual event needed when the proceedings exist? Ah yes, the “discussions” and the “panels”. Nevertheless, this remains a serious question and perhaps an incentive to keep ensuring that virtual events in real time offer added value.
Almost a Lewandowski Festival:
HAW researcher Dirk Lewandowski
The “Information Retrieval” session turned out to be something of a “Lewandowski Festival”, as three of the four contributions came from him and his co-authors. Even if one were to combine the third and fourth papers, as would seem obvious, the ratio of Lewandowski et al. to other authors would still be 2:1. Does this indicate that German-speaking information science still has considerable untapped potential for cooperation? The session confirmed, with representative data for Germany, that search engine users tend to greatly overestimate their skills.
__________________________________________________________________________________
A representative study confirms that users overestimate their own search engine competence.
__________________________________________________________________________________
Users typically rate their own search engine competence as high, but in doing so they deceive themselves. This was confirmed once again by a representative study by Sebastian Schultheiss and Dirk Lewandowski (“(Un)known actors on the search results page? – A comparison between self-assessed and actual search engine skills of German Internet users”). In it, 84% of those surveyed rated their search engine skills as good, of which 32% rated them as “very good”.
In truth, as this research shows, many users do not understand the difference between ads and organic search results. They therefore cannot put the results of their searches into perspective with knowledge of search engine optimization and search engine advertising. On the contrary, a majority regards search engines as a trustworthy source of information and trusts the operators’ rankings to such an extent that they take note of only the first few results, if not the first result alone.
Given that search engines, which in practice means Google, play a part in the daily lives of almost all citizens, teaching information literacy should primarily mean teaching skills in using Google. However, in a survey of 80 libraries about their measures to increase information literacy, topics other than Internet research were addressed more often, such as library use, databases and even interlibrary loans.
The authors concluded: “Promoting search engine competence is therefore crucial for placing Internet users’ knowledge acquisition via search engines on a sound basis. However, in the context of information literacy, search engine competency is hardly taken into account. … Information literacy training (must) start where users search in their everyday lives, namely on commercial search engines. If they use these systems to identify their skills gaps, they may also be more open to further training, for example on how to select and use special databases.”
___________________________________________________________________________
In the collaborative search for information, old and young people contribute equally to the decisions.
___________________________________________________________________________
How does very close (“collaborative”) interaction between old and young people proceed when they search for information? Carolin Schulz, Stefanie Elbeshausen and Christa Womser-Hacker sought to answer this question using video recordings of such searches for travel information and additional open interviews (“Collaborative information search behavior of people from different generations”). The authors were particularly interested in the role behavior of the people observed and in whether one of the searchers took on a leadership role in the search process.
Above all, the results seem to confirm a 2008 study by Amershi and Morris and are not too far removed from our everyday preconceptions: “The young generation is reflected in the role of drivers because they have control over the technical device. The older generation can be classified in the role of observer because they watch and make suggestions.” Accordingly, there are “different forms of collaboration between members of different generations: with regard to familiarity with technology and expertise, the relationship is asymmetrical. However, there is a symmetrical expression in terms of the authority of the collaborators.”
However, the authors also try to prove the relevance of their study: “In the long term, this would make it possible for different generations to work together in a pandemic-resistant manner and benefit from each other.” That would certainly be desirable. I cannot see, however, how this and similar studies could contribute to that.
__________________________________________________________________________________
Is my text on the screen search engine optimized? That may be very likely or less likely.
__________________________________________________________________________________
Search engine research is to a large extent a guessing game, as the ranking of search results is carried out by an algorithm that is known only to the search engine operator. It is actually a double guessing game, because the existence of many companies depends on their advertising information appearing at the top of search engine results. They must therefore try to follow the algorithm’s assumed preferences by designing their texts accordingly (= Search Engine Optimization/SEO). An SEO industry has now emerged, which is expected to have reached sales of 80 billion dollars in the USA alone in 2020. According to studies by Sebastian Sünkler and Dirk Lewandowski, 90% of the search results they examined were probably search engine optimized (“Making the influence of search engine optimization measurable – A semi-automated approach to determining optimized results on Google’s search results pages”).
Does this mean that attempts at search engine optimization are usually successful? Not at all, if SEO is part of the usual professional design of commercial or commercially driven texts for the Internet. In that case, outside of private and civil society areas, almost only search engine optimized texts would compete with each other for the top places, and the probability that a given text is search engine optimized would be close to 1. The ranking of search engine results is also likely to be determined by factors other than search engine optimization, for example previously unrecognized preferences of the algorithm, decisions by search engine operators and their desire to place their own other offerings in a prominent position, as well as the reputation of sources and possibly even the actual quality of texts.
Let us turn to the presentation by Sünkler/Lewandowski at ISI 2021, which can also be described as a detailed presentation of “research in progress”. In their abstract, the authors present their work as follows (the examples and explanations in brackets refer to their further text): “To study the influence of SEO on search results, we use semi-automatic processes and have developed a software framework that uses a rule-based classifier to determine the probability of optimization measures on search results.” The approach is based on 20 characteristics, which include, for example, the analysis of the following variables: SEO plug-ins and analytics tools used, e.g. Google Analytics – URL lists, for example of the customers of SEO agencies as well as of company websites and websites with advertising – the evaluation of technical indicators such as loading speed (thesis: “If the value is less than three seconds, it is an indicator of optimized content”) – the use of certain tags, e.g. for description – as well as a manual classification based on pre-compiled lists of optimized and non-optimized websites. The more of these characteristics are present for a document, the more likely it is that the document has been optimized for search engines.
Sünkler/Lewandowski then laid down rules for deciding “whether a document is most likely optimized, probably optimized, probably not optimized or most likely not optimized.” Only Wikipedia falls into the last category, “since our approach is still in development” and because it is known that no active SEO measures are used at Wikipedia.
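To make the idea of a rule-based classifier more tangible, here is a minimal sketch in Python. It is not the authors' framework: it uses only five of the characteristics mentioned above instead of the 20 in the study, and the field names, the `Page` class, the list `KNOWN_NOT_OPTIMIZED` and the thresholds of 4 and 2 indicators are assumptions chosen purely for illustration. Only the four category labels, the three-second loading threshold and the special treatment of Wikipedia are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumption: a hand-maintained list of sites known not to use active SEO.
# The article names only Wikipedia for this category.
KNOWN_NOT_OPTIMIZED = ("wikipedia.org",)


@dataclass
class Page:
    """Observations about one search result (field names are illustrative)."""
    url: str
    seo_plugins: Tuple[str, ...] = ()        # SEO plug-ins detected on the page
    analytics_tools: Tuple[str, ...] = ()    # e.g. "Google Analytics"
    on_seo_customer_list: bool = False       # URL appears on a pre-compiled list
    load_time_seconds: Optional[float] = None
    has_description_tag: bool = False


def count_indicators(page: Page) -> int:
    """Count how many of the (simplified) SEO indicators are present."""
    # Thesis from the text: a loading time under three seconds hints at optimization.
    fast = page.load_time_seconds is not None and page.load_time_seconds < 3.0
    return sum([
        bool(page.seo_plugins),
        bool(page.analytics_tools),
        page.on_seo_customer_list,
        fast,
        page.has_description_tag,
    ])


def classify(page: Page) -> str:
    """Map a page to one of the four categories named in the article."""
    host = urlparse(page.url).hostname or ""
    # Manual rule: listed domains (and their subdomains) are never counted as optimized.
    if any(host == d or host.endswith("." + d) for d in KNOWN_NOT_OPTIMIZED):
        return "most likely not optimized"
    hits = count_indicators(page)
    if hits >= 4:   # hypothetical threshold
        return "most likely optimized"
    if hits >= 2:   # hypothetical threshold
        return "probably optimized"
    return "probably not optimized"
```

The sketch reflects the principle stated above: the more indicators are present, the higher the category. A page such as `Page(url="https://blog.example/", load_time_seconds=2.0, has_description_tag=True)` shows two indicators and would be labelled "probably optimized" under these assumed thresholds.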
The Sünkler/Lewandowski approach was applied to three data sets (on Google trends, right-wing radical content and Corona) with a total of 2,043 search queries and 263,790 results. “The results show that a large proportion of pages found in Google are at least probably optimized, which is consistent with statements from SEO experts who say that it is very difficult to become visible in search engines without applying SEO techniques.”
If we follow Sünkler/Lewandowski, we can now determine which search engine results are likely or less likely to have been optimized. If you are interested in the pragmatic use of research results, this can only be an intermediate step. Do the results also say something about “whether the influence of search engine optimization has a positive or negative effect on the quality of results”? The authors: “For such analyses, the methodology would have to be determined, for example, through retrieval studies in which jurors evaluate the quality of the search results.”
__________________________________________________________________________________
A tool for determining the likelihood of search engine optimization on a website.
__________________________________________________________________________________
As part of their study presented above, Sebastian Sünkler and Dirk Lewandowski developed an SEO tool that determines “the likelihood of search engine optimization on a website”. A demo of the tool is available at http://5.189.20:5000 .
DGI events
Providing information in times of distance
Virtual DGI regulars’ table, July 13, 6 p.m. – 8 p.m., online
DGI Forum 2021 – Providing information in times of distance, October 28th and 29th, online
– immediately following: CfP – Knowledge Organization 2021, Digitization and knowledge organization: between indexing and knowledge graphs?
DGI online seminars summer 2021:
Writing workshop – Instagram: June 30, 9:30 a.m. – 1:00 p.m., online, M. Borchardt
Writing workshop – Facebook: June 1, 9:30 a.m. – 1:00 p.m., online, M. Borchardt
Social media and public relations I: July 7, 9:30 a.m. – 1:00 p.m., online, P. Landes
Social media and public relations II: July 8, 9:30 a.m. – 1:00 p.m., online, P. Landes
Social media and research: July 12, 9:30 a.m. – 1:00 p.m., online, Chr. Rahner-Göhring
Digital photography for social media and websites I: July 13, 9:30 a.m. – 1:00 p.m., online, M. Borchardt
Digital photography for social media and websites II: July 14, 9:30 a.m. – 1:00 p.m., online, M. Borchardt
Writing workshop – websites: September 6, 9:30 a.m. – 1:00 p.m., online, M. Borchardt
Writing workshop – blogs: September 7, 9:30 a.m. – 1:00 p.m., online, M. Borchardt
Christoph Haxel
The Artificial Intelligence Conference on Search,
Data and Text Mining, Analytics and Visualization
Dear Colleagues
The Preliminary Program is published and looks excellent.
Only 3 speaking slots and 3 exhibitor places are still available. The Super Early Bird Registration is open … book now and save money.
I am keeping my fingers crossed and look forward to seeing you in nice Nice or meeting you via Zoom.
With kind regards Christoph Haxel
The Program:
Monday October 4th, 2021
Conference starts at 09:00
Opening by Christoph Haxel (Dr. Haxel CEM, Germany)
Ping Pong – Playful Knowledge Transfer
Vanessa Lage-Rupprecht (Fraunhofer Institute for Algorithms and Scientific Computing SCAI, DE), Marc Jacobs (Fraunhofer Institute for Algorithms and Scientific Computing SCAI, DE)
AILANI for clinical competitive landscaping
Angela Bell (Biomax Informatics, Germany)
New Product Introductions: Biomax INFORMATICS // DEEP SEARCH 9 // CENTREDOC
Exhibition and Networking Break
Integrated Artificial Intelligence – A Factory Progress Report
Harald Jenny (CENTREDOC, Switzerland)
Srinivasan Parthiban (VINGYANI, India)
New Product Introductions: Lighthouse IP // Search Technology / VantagePoint
Lunch, Exhibition and Networking
Synonym and AI
Jay Ven Eman (CEO, Access Innovations, USA)
New Product Introductions: Dolcera
Exhibition and Networking Break
The secret of successful CI: precise targeting + immediate discovery
Klaus Kater (Deep SEARCH 9, Germany)
Tuesday October 5th, 2021
Conference starts at 09:00
Semantic Search and Content Management – Case Studies in Successful Software Implementations
Marjorie Hlava (Information Access, USA)
Project Management Challenges for IP Projects
Lucy Antunes (CAS IP Services, USA), Muriel Bourgeois Tassanary (MT-IP Consulting, FR)
Exhibition and Networking Break
Machine learning tools in patent searching – are we on the right track?
Heiko Wongel (Wongel IP)
Leveraging pre-trained language models for document classification
Holger Keibel (Karakun, Switzerland), Elisabeth Maier (Karakun, Switzerland)
Lunch
AI – Who is in control and why is that important?
Nils Newman (Search Technology, USA)
Exhibition and Networking Break
News from Patinformatics – Title is coming soon.
Tony Trippe (Patinformatics, USA)
Closing Remarks – Christoph Haxel
OpenPassword
Forum and news
for the information industry
in German-speaking countries
New editions of Open Password appear four times a week.
If you would like to subscribe to the email service free of charge, please register at www.password-online.de.
The current edition of Open Password can be accessed on the web at www.password-online.de/archiv immediately after it appears. This also applies to all previously published editions.
International Cooperation Partner:
Outsell (London)
Business Industry Information Association/BIIA (Hong Kong)