Open Password – Monday April 12, 2021
#908
vfm spring conference – Media documentation – Corona – Agile organization – New roles and workflows – Mario Müller – Seven.One Production – Virtual conference – Face-to-face event – Technical innovations – Corporate strategies – Customer behavior – Training focuses – Technical qualification – Artificial intelligence – Music documentation – Gema administration – Open Password – Going Green – Central Banks – G20 Countries – China – Brazil – France – Germany – Positive Money – Climate Change – Ecological Breakdowns – Research and Advocacy – Fossil Fuel Reduction – Market Neutrality – European Union – Saudi Arabia – Chinese Central Bank – Clean Coal – Jair Bolsonaro –
Machine learning – Image similarity search – Artificial intelligence – Historical library holdings – Klaus Kempf – Markus Brantl – Thomas Meiers – Thomas Wolf – Extraction of visual features – Descriptors – Support vector machine – Efficient parallel search – Index file – Bavarian State Library – Query by Example – Accompanying indexing based on textual metadata
1.
Interview with Mario Müller, Chairman of vfm: Media documentation in the midst of major upheavals
2.
Outside the box (36): Going Green? The central banks of the G20 countries
are good on paper, not so much in action
Cover story: Machine learning – In search of the hidden image – Artificial intelligence opens up historical library holdings
vfm spring conference
Media documentation in the midst of major upheavals:
the challenges of “Corona”, “Agile Organization” and “New Roles and Workflows”
Mario Müller, chairman of the vfm (lecture 2017)
Mario Müller from Seven.One Production is chairman of the vfm – Association for Media Information and Media Documentation. Open Password spoke to him in the run-up to the vfm spring conference, one of the few events in the information industry that has lasted over the decades and has continually renewed its content. The program of the spring conference “Great Freedom or Quarantine – Agile Media Documentation in Times of Corona” with the sessions “Agile Organization and Development”, “News from Universities” (including newcomer forum), “Technical Innovations”, “Music and Documentation” and “New Roles and Workflows” was published in Open Password on March 24th. The meeting will take place from Monday, April 26th to Wednesday, April 28th.
__________________________________________________________________________________
In view of the current upheavals, the vfm spring conference seems more important than ever.
__________________________________________________________________________________
A virtual conference instead of a face-to-face event is still an adventure. What did you do to make the vfm spring conference a success? Do you also see advantages of a virtual conference over an in-person event?

Many of our colleagues prefer a face-to-face event, because the personal professional exchange with colleagues from other media companies and broadcasters is an important motive for attending our spring conference. In view of the pandemic, we only had the choice of canceling the conference, as we did last year, or holding it virtually. We chose the latter. The technical and organizational developments in our industry are so rapid that it is important to exchange ideas about the most important topics and technical innovations and to highlight the latest developments at a central event like this conference.
A virtual conference definitely has its advantages. Travel costs are eliminated and some undecided participants register in order to only attend the event blocks that interest them. Then, as a non-profit organization, we do not want to make any profits from this conference. Where else can you get a three-day training event for less than 200 euros? Of course, we hope that the event will also be accepted as a virtual one and will be a success. We are breaking new ground, even though we have brought professional support on board. But if you don’t dare, you don’t win.
Regarding the contents of the conference, as one can see from your program (with the exception of session 4): Why did you and your colleagues choose these particular areas of focus? To what extent do they hold particular risks and opportunities for media documentarians?

The conference program is put together by our program committee, which is made up of managers from several media companies, publishers and broadcasters. If our departments want to continue to exist and actively shape their future, several factors are particularly important. First and foremost, we need to know our companies' business strategies and the usage behavior of our consumers. This allows us to anticipate future developments and determine what contribution we can make with our specialist skills and how we need to prepare ourselves for this.
This requires a willingness to change and to react flexibly. We also have to regularly adapt our training focuses to changing requirements. More technological skills will be required in the future, because we have to deal with a flood of media content that needs to be edited and distributed, and that is only possible with the help of new technologies. Artificial intelligence is already finding its way into our everyday work; we want to inform ourselves and exchange ideas about this. Cross-media thinking is also crucial. For example, long-established newspaper publishers can no longer be successful if they do not place photos and videos alongside their texts. Constant change also requires a rethink of how work is organized: different and new forms of work are being tried out, an issue that the pandemic has pushed further to the fore.
In short, I find what is currently happening in our environment to be so exciting that I can only recommend that all colleagues take part in the conference. The risk is to do nothing and carry on as before. And I don’t have to tell anyone what the consequences are.
__________________________________________________________________________________
Anyone who holistically combines administration, research and Gema administration in music documentation will grow.
__________________________________________________________________________________
A session on “Music and Documentation” – that’s a nice specialization you’ve chosen.

We have repeatedly set content-related focuses at previous spring conferences, for example on sports or tabloid documentation. Music is very important, especially when it comes to video and audio exploitation; without it, nothing works. It is therefore time to bring this topic back into focus. The music departments in our media companies are set up very differently. The importance of those who have holistically linked administration, research and Gema administration and who also use new technologies will continue to grow, as will their influence. I think the speakers on this topic will give us a lot of ideas that we can learn from.
Who participates? How many are currently taking part? What do you expect from this conference?

The participants from Germany, Austria and Switzerland usually work in broadcasters, publishing houses, private media companies and municipal institutions.
We expect exciting lectures, a lively exchange between the participants and lots of ideas that we can take with us into our working world after the conference. Of course, we are excited to see how the modified conference concept will be received in this virtual form. We hope for a lot of feedback and suggestions for the future.
Open Password will report on the vfm spring conference “Great Freedom or Quarantine – Agile Media Documentation in Times of Corona”.
Outside the box (36)
Going Green? The central banks of the G20 countries
are good on paper, not so much in action
China leads the ranking ahead of Brazil
and France, with Germany in 7th place
The Green Central Banking Scorecard – How Green Are G20 Central Banks and Financial Supervisors?, in: http://positivemoney.org/wp-content/uploads/2021/03/Positive-Money-Green-Central-Banking-Scorecard-Report-31-Mar-2021-Single-Pages.pdf. Central banks are obliged to incorporate environmental considerations into their policies if they are to fulfill their mandate and prevent the dangers of climate change and ecological collapse. But how “green” are the central banks of the G20 countries really? This was determined in the areas of research and advocacy, monetary policy, financial policy and “leadership by example” through literature studies and discussions with experts and central bank representatives, assessed using a school grading system (from A to F) and ranked. While 13 out of 20 countries received full marks in the area of “research and advocacy”, there was widespread inaction on reducing fossil fuels in all countries. The authors say: “While further research and advocacy is generally positive, this will not earn additional points for the majority of the institutions assessed here.”
The authors continue: “Crucially, this report highlights that encouraging the growth of more green activity is no substitute for institutions winding down their support of the fossil fuel intensive and ecologically harmful aspects of our financial system. Furthermore, our report shows that the primary concerns expressed by central banks and commentators about the prospect of ‘going green’ – that it would threaten the independence of central banks – are invalid; their so-called ‘market neutrality’ serves as little more than a facade to paper over the inherently political nature of policy decisions made by central banks, and current approaches focused on climate-related disclosures and stress tests are insufficient.”
As the ranking in the table above shows, the grades “very good” and “good” were not awarded at all. Only China achieved “fully satisfactory”, with 50 out of a maximum of 130 points, followed by Brazil and France, each rated “3 minus”. Nine countries received a grade of “sufficient”, including the European Union in fifth place with 33 points and Germany in seventh place with 29 points. The bottom performers, each graded “unsatisfactory”, were India, Russia, South Africa, Turkey, Argentina and Saudi Arabia.
The Chinese central bank issued its first “green initiative” as early as 1995, the “Notice on Implementing Credit Policies and Enhancing Environmental Protection”, with guidelines for banks on “how to better include environmental variables in credit decisions”. The country probably owes its top ranking among the G20 countries to its close coordination with other government bodies. However, the authors consider the Chinese concept of “clean coal” a dangerous myth. In 2008, the Brazilian government required banks to conduct credit checks to determine whether investors in the Amazon were complying with environmental regulations. Today, “the BCB must ensure that its policies are effectively implemented on the ground”. In the context of the current Bolsonaro government, the prospects for social progress and environmental policy in Brazil are weak.
Machine learning
Looking for the hidden picture
Artificial intelligence opens up
historical library holdings
Visual feature extraction, efficient parallel search, flanking indexing
based on textual metadata
By Klaus Kempf, Markus Brantl, Thomas Meiers and Thomas Wolf
Second part
Visual feature extraction. The basis of the image similarity search is formed by the distinct visual properties of an image, its specific color and edge information, which must first be appropriately captured and encoded. For this purpose, so-called descriptors are used, which store the visual information of an image in highly compressed form: in our case, the descriptor belonging to an image has a size of only 96 bytes. The visual descriptor encodes both the color properties and the specific edge features.
To capture the color information, the image is broken down into 8×8 uniform areas. For each area, the average gray value (Y value) and the color values Cb and Cr are determined. In this way, three blocks of 8×8 values are obtained, each of which is subjected to a two-dimensional cosine transformation. The resulting coefficients are sorted by frequency, and the first 15 (gray value Y) or the first ten (color values Cb and Cr) are taken into the descriptor.
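The color-layout step described above can be sketched as follows; the zigzag-style frequency ordering and all function names are our assumptions, not the authors' actual code:

```python
import numpy as np
from scipy.fft import dctn

def color_layout(y, cb, cr):
    """Sketch of the color part of the descriptor: 8x8 area averages
    per channel, 2-D cosine transform, lowest-frequency coefficients."""
    def channel_coeffs(img, n_keep):
        h, w = img.shape
        h8, w8 = h - h % 8, w - w % 8
        # break the image into an 8x8 grid of areas and average each
        means = img[:h8, :w8].reshape(8, h8 // 8, 8, w8 // 8).mean(axis=(1, 3))
        coeffs = dctn(means, norm='ortho')   # two-dimensional cosine transform
        # sort coefficients by spatial frequency (low frequencies first)
        i, j = np.indices((8, 8))
        order = np.argsort((i + j).ravel(), kind='stable')
        return coeffs.ravel()[order][:n_keep]
    # 15 coefficients for Y, ten each for Cb and Cr -> 35 values in total
    return np.concatenate([channel_coeffs(y, 15),
                           channel_coeffs(cb, 10),
                           channel_coeffs(cr, 10)])
```

For a uniformly gray image, only the first (DC) coefficient of each channel is non-zero, which is exactly the compression effect the descriptor exploits.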
Figure 2 : Edge histograms for the individual areas. At the top left is the original image, at the right is the edge image, below which are edge histograms for two image areas. The abscissa indicates the different edge orientations.
To display the edge information, an edge filter is first applied to the gray value image. You get an edge vector for each pixel. If there is no edge, the edge vector is a zero vector. The edges are divided into three classes depending on their length:
- contourless surfaces (edge vectors very small, shown as a circle on the abscissa of the histogram in Figure 2);
- Textures (edge vectors have a mean value, gray symbol in the abscissa in Figure 2);
- real edges (edge vectors have a high value, black symbol in the abscissa in Figure 2).
Textures have two directions (vertical and horizontal), and real edges have four (vertical, horizontal and the two diagonals). The image is divided into 16 areas (four each in the vertical and horizontal directions). For each area, the frequency of the different edge types (classes as well as directions) is determined. This yields 16 edge histograms with seven values each.
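A sketch of this edge-classification step; the two magnitude thresholds separating flat areas, textures and real edges are our assumptions (the article gives no concrete values):

```python
import numpy as np
from scipy import ndimage

def edge_histograms(gray, t_texture=10.0, t_edge=40.0):
    """Sketch of the edge part of the descriptor:
    16 areas x 7 bins = 112 values."""
    # Sobel filtering yields an edge vector (gx, gy) per pixel
    gx = ndimage.sobel(gray, axis=1, output=float)
    gy = ndimage.sobel(gray, axis=0, output=float)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)

    def bin_of(m, a):
        if m < t_texture:                       # contourless surface
            return 0
        if m < t_edge:                          # texture: two orientations
            return 1 if abs(a - 90.0) < 45.0 else 2
        return 3 + int(((a + 22.5) % 180.0) // 45.0)  # one of four edge directions

    h, w = gray.shape
    hist = np.zeros((4, 4, 7))                  # 4x4 grid of areas, 7 bins each
    for i in range(h):
        for j in range(w):
            r, c = min(4 * i // h, 3), min(4 * j // w, 3)
            hist[r, c, bin_of(mag[i, j], ang[i, j])] += 1
    hist /= hist.sum(axis=2, keepdims=True)     # frequencies per area
    return hist.reshape(-1)
```

On a completely flat image, every pixel falls into the "contourless" bin, so each of the 16 area histograms concentrates at bin 0.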
The values from the color layout and the frequencies of the edges from all 16 areas are combined into a vector. An L² norm is used as the distance measure, which is mapped onto a similarity scale from 0.0 to 1.0 using an exp(-x) function. The proportion of color to edge information can be weighted using a factor.
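A minimal sketch of this similarity computation; the descriptor split assumed here (35 color values followed by 112 edge values) and the default weighting factor are our illustrative assumptions:

```python
import numpy as np

def similarity(d1, d2, color_weight=0.5, n_color=35):
    """Weighted L2 distance between two descriptors, mapped onto a
    0.0-1.0 similarity scale with exp(-x); 1.0 means identity."""
    dc = np.linalg.norm(d1[:n_color] - d2[:n_color])   # color part
    de = np.linalg.norm(d1[n_color:] - d2[n_color:])   # edge part
    dist = color_weight * dc + (1.0 - color_weight) * de
    return float(np.exp(-dist))
```

Identical descriptors give distance 0 and thus similarity exp(0) = 1.0; growing distances decay smoothly toward 0.0.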
The length of the descriptor is 96 bytes. This already takes into account that the individual values are quantized to four to eight bits each when stored. Thanks to the small size of the descriptor, hardly any memory is required and the distance calculations can be carried out very quickly (several million distance calculations per CPU core per second). Without reducing the image information to descriptors, a search across such an extensive inventory (more than 54 million images) would not be possible in real time.
Sorting out irrelevant images using machine learning methods. During the preparations, it quickly became apparent that the identified images contained a very large proportion of motifs without any informational value, which distorted the search results. These include, for example, single-colored areas, which often occur in books (margins, empty pages), book covers, but also stains, tears and the omnipresent ownership stamps. In addition, there are content elements with no informational value as images; tables and sheet music are particularly worth mentioning here. Such images initially accounted for well over twenty percent of the total. In a subsequent analysis, they were filtered out using machine learning methods.
For this purpose, classes were first put together using example images. A one-class support vector machine (SVM) with a Gaussian function as a kernel was then trained for each class with the descriptors of the example images. A total of eleven classes were trained. The system can be expanded and adapted to new classes at any time.
To decide whether an image should be kept or discarded, its descriptor is fed into all SVMs one after the other. If an SVM recognizes the image as belonging to its class, the image is sorted out. The classification is very fast because the number of support vectors per SVM is in the double-digit range. With eleven SVMs, on the order of a thousand distance calculations are necessary, which take around one millisecond of computing time.
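Using scikit-learn, this filtering stage might look as follows; the junk-class names and the SVM parameters (such as `nu`) are illustrative assumptions, not values from the article:

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_junk_filters(examples_by_class):
    """One one-class SVM with an RBF (Gaussian) kernel per junk class,
    trained on the descriptors of example images."""
    return {name: OneClassSVM(kernel='rbf', gamma='scale', nu=0.1).fit(x)
            for name, x in examples_by_class.items()}

def keep_image(descriptor, filters):
    """An image is discarded as soon as any junk-class SVM claims it
    (OneClassSVM.predict returns +1 for inliers)."""
    d = descriptor.reshape(1, -1)
    return not any(svm.predict(d)[0] == 1 for svm in filters.values())
```

New junk classes can be added at any time simply by training one more SVM and putting it into the dictionary, matching the extensibility described above.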
Efficient parallel search. In order to make the contents of the digital inventory available for an image search, the inventory must first be fully indexed. Indexing is a one-time process that is very time- and computation-intensive: fully indexing the BSB inventory takes around two weeks, with the task distributed across a cluster of over thirty CPUs working in parallel. The result is an index of all files containing the relevant search information. If new digitized books are added, only these need to be indexed and the existing index is expanded accordingly; removing books from the index works the same way in reverse.
For each image, the index file contains the descriptor already mentioned as well as other important information: the ID of the work from which the image comes and the page number on which it is located. Additionally, exact pixel coordinates define the position of the image on the page.
For the application, the index file is loaded into main memory when the server starts and can therefore be accessed very quickly. For a search query, the query descriptor is first determined; if the query image is part of the inventory, its descriptor can be read directly from the index data. The query descriptor is then compared with all descriptors in the inventory and the k best results are output. The number k of best results can be set by the user.
In addition, a distance function is required that specifies the distance between two descriptors. This function should use the descriptors to represent the visual difference between two images as faithfully as possible. From the distance function, a similarity function is calculated that outputs a value between 0.0 and 1.0: the value 0.0 means maximum dissimilarity, the value 1.0 maximum similarity or identity.
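Put together, the in-memory search reduces to a vectorized distance computation plus a top-k selection. This brute-force sketch assumes the index is held as a NumPy array with one descriptor per row; the exp(-x) mapping mirrors the similarity scale described above:

```python
import numpy as np

def search(query, index, k=10):
    """Compare the query descriptor against every descriptor in the
    in-memory index; return the k best positions and their similarities."""
    dists = np.linalg.norm(index - query, axis=1)   # one L2 distance per image
    best = np.argsort(dists)[:k]                    # k smallest distances first
    return best, np.exp(-dists[best])               # map distances to 0.0-1.0
```

In production such a scan would be split across CPU cores, each handling a slice of the index, which is what makes millions of distance calculations per second feasible.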
The first version was available for public use in April 2013. At that time, four million image segments were already available, which were obtained from the evaluation of 60,000 digitized works. In the following years this number was increased to six million image segments from 80,000 volumes. A new phase began in 2016. The latest version of the software captures all BSB digital images and now offers 54 million image segments for searching.
“Query by example” is used as the search method, whereby a search image is specified for which the visually most similar images in the inventory are searched for. Visual search is often used as an iterative process in which the user gradually approaches the images they are looking for.
The similarity search was implemented as a client–server system. The user starts the image search via a web client. The request is processed by a server, which returns the result to the web client.
__________________________________________________________________________________
Accompanying indexing based on textual metadata
__________________________________________________________________________________
Regardless of the challenge of indexing the image collection, the question arises as to how the collection can be made accessible to the user. While a text-based search delivers results as soon as a term is entered, the situation is different with an image database. It was therefore important to create an entry point for the user. This was done through a two-step categorization.
In a first step, all existing text-based structural metadata at page level was evaluated and automatically compared against a list of several hundred key terms. In this way, for example, all pages with the key term “portrait” (in its different spellings) were assigned to the “people” category. Terms such as “interior view” and “side nave” could be assigned relatively clearly to the topic of “architecture”.
The end result was the categorization of several thousand images into the categories “People”, “Architecture”, “Plants”, “Animals”, “Coat of Arms” and a few others.
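The keyword matching of the first step amounts to a simple lookup; the terms and categories below are a small invented subset of the several-hundred-term list actually used:

```python
# Hypothetical excerpt of the keyword list mapping metadata terms to categories.
CATEGORY_KEYWORDS = {
    'people': ['portrait', 'porträt'],
    'architecture': ['interior view', 'side nave'],
    'coat of arms': ['wappen'],
}

def categorize(page_metadata):
    """Return every category whose key terms occur in the page-level
    structural metadata (case-insensitive substring match)."""
    text = page_metadata.lower()
    return [cat for cat, terms in CATEGORY_KEYWORDS.items()
            if any(term in text for term in terms)]
```

Pages whose metadata matches no key term simply remain uncategorized and are reached only via the similarity search of the second step.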
In the second step, similarity searches were triggered based on the previously categorized images; the images found in this way could largely be assigned to the same category. Here, categorization was carried out as a manual process in order to enlarge the categorized inventory.
Independent of this measure, the text-based structural metadata was prepared for a text search. This means that the visitor should be able to search for “portrait” or other terms and names and find what they are looking for.
Read in the final episode: The application with the upload option by the user – Limits of image search – Conclusion and perspectives
OpenPassword
Forum and news
for the information industry
in German-speaking countries
New editions of Open Password appear four times a week.
If you would like to subscribe to the email service free of charge, please register at www.password-online.de.
The current edition of Open Password can be accessed immediately after it appears on the web. www.password-online.de/archiv. This also applies to all previously published editions.
International Cooperation Partner:
Outsell (London)
Business Industry Information Association/BIIA (Hong Kong)
Open Password Archive – Publications
DATA JOURNALISM
Handelsblatt’s Digital Reach