Introduction/Background Information
Duquesne University is a private, Catholic, co-educational university in Pittsburgh, Pennsylvania. Its total enrollment in 2023 was 8,179 students. Gumberg Library serves all University undergraduate and graduate students and faculty.
After several major weeding projects and the adoption of a new electronic-preferred acquisitions policy, the physical collection now holds 50 percent fewer items than it did just twelve years ago. During that same time, electronic resources increased fivefold. The current collection is approximately 78 percent electronic and 22 percent physical items.
As the collections changed, so did staffing. In 2012, when the collections contained over 800,000 physical items and just over 200,000 electronic titles, technical services had twelve full-time and two part-time positions. Today, there are just over 300,000 physical items and over 1,000,000 electronic resources handled by seven full-time and two part-time employees. Adding fewer physical items each year required fewer staff to catalog and process items; however, electronic resources also require maintenance.
Collecting E-Resource Usage Statistics
As resources changed from a mostly physical collection to a predominantly electronic collection, it was important to address who should collect e-resource usage and how they should go about it.
As electronic resources were added, librarian positions were adapted to take on new responsibilities. The serials librarian, who was responsible for the print and microfilm serials, became responsible for the electronic resources as well, as a large percentage of electronic resources were serials. Soon, that position changed responsibilities to focus on electronic resources exclusively. In 2018 the position was renamed access & discovery librarian. The incumbent’s main responsibilities were electronic resources management tasks such as uploading e-book records into the discovery layer, turning titles off or on in the e-resources holdings management utility (in this case EBSCOadmin), troubleshooting discovery layer problems, and collecting e-resources statistics.
In 2022, the access & discovery librarian left for another job and the position was eliminated due to continued budget constraints. The head of licensing & acquisitions and the cataloging & metadata librarian assumed those responsibilities. With a thorough knowledge of the acquisitions process, the head of licensing & acquisitions took over uploading the e-book records into the discovery layer, the title activation and deactivation process, and keeping the public-facing list of databases up to date. The cataloging & metadata librarian, who had been assisting with usage data collection, took over all e-resources statistics collection and reporting tasks.
The reasons for elaborating on the evolution of this serials librarian position are twofold. The first is to illustrate how the authors’ library ended up assigning e-resource usage collection to the cataloging & metadata librarian. Each library seems to have a unique set of circumstances that determines which position deals with usage statistics, and this history explains Gumberg Library’s situation. The second is to emphasize that a full-time position in which the librarian spent a large share of their time on e-resource statistics had been eliminated. With a collection of over 1,000,000 e-resource titles, that task was now assigned to a librarian who already had a full workload. Because of reduced staffing, the library needed to find a way to collect and report e-resource statistics as efficiently as possible.
It had been decided who should be responsible for collecting statistics, but how did they learn what they needed to know? When the cataloging & metadata librarian first began helping to collect usage statistics, she did not know much about the process. There was no documented workflow for collecting usage statistics to reference. The Project COUNTER website and YouTube videos proved helpful for learning about COUNTER metrics, and books and professional articles gave information about e-resource management in general.1 Specific information on how to collect usage statistics proved difficult to find. The librarian reached out to a group on Facebook called Troublesome Catalogers and Magical Metadata Fairies for guidance on how to learn about e-resource usage collection and received responses such as “From this day forth, you will forever bang your head against the wall,” and “I’m so sorry to have to be the one to inform you of this … but no. You have now fallen down the rabbit hole of Usage Statistics,” and “We just used the massive spreadsheet method and converted the COUNTER and non-COUNTER to one uniform sheet to be able to compare.”
The cataloging & metadata librarian next investigated how other libraries gathered and processed usage statistics and, in doing so, discovered that some library service platforms included their own tools for collecting COUNTER and non-COUNTER data. Alma, WorldShare, and FOLIO hosted by EBSCO were listed as examples, but these tools are usually add-on modules with extra costs. CORAL was given as an example of an open-source solution. Many libraries used multiple systems and tools such as “the giant spreadsheet method” and/or LibInsight, a Springshare library analytics tool. It was determined that no single system was used, or even preferred. There did not seem to be one place to go to learn what was needed. Everyone appeared to be learning on the job and coming up with the best solutions for their library’s particular situation.
When the cataloging & metadata librarian first started to collect usage data, Gumberg librarians were using the massive spreadsheet method. They pulled usage data from each vendor’s website and entered numbers into a shared Excel spreadsheet. This process was labor intensive, and the spreadsheet was updated only once a year. Without access to an integrated e-resource management system, they investigated other sources. Gumberg Library staff already used LibInsight for statistics such as gate counts and course reserves. Collection of the COUNTER usage data began after using the online tutorials provided in LibInsight.2 Adding all vendors to a single report and configuring Standardized Usage Statistics Harvesting Initiative (SUSHI) harvesting permitted automatic uploading of the COUNTER usage data. The process resulted in the creation of a quarterly report. It was an improvement, but the result was still a spreadsheet that was not easy to find, and its numbers were typed into cells manually. During this time, the Systems and Scholarly Communications Department had success presenting other library statistics using Power BI dashboards.3 The goal was to find a way to collect usage information monthly and present it in Power BI so that e-resource usage could be shown with all of the other library data.
Power BI Dashboards for Library Data
Dashboards can help libraries establish a culture of data by quantifying and presenting in meaningful ways the activities that happen every day within the library. Dashboards reach across library departments so that the activities of one department (or functional area) are known to other functional areas. Dashboards allow library employees to ask critical questions, such as: how does my activity relate to the activities of other units; what is the duplication of activity from one unit to another; and how can departments collaborate with each other to bring meaningful improvements and efficiencies to library activities?
Because Duquesne is a Microsoft campus, Power BI could be licensed as part of the institutional subscription. The same may not be the case for other institutions, which may have access to another product such as Tableau or Google Data Studio. Leveraging an institution’s license saves a lot of steps and a lot of work. Relying on the institution’s license also addresses security and storage concerns that the library would otherwise have to address on its own. Power BI, Tableau, and Google Data Studio differ from the dashboards built into systems such as Alma or EBSCO’s Panorama because their dashboards can easily be made public and do not depend on system-specific credentials and permission layers to update. Working with these enterprise tools is no more complex than updating a Microsoft Word document and connecting the data to it. Permissions can still be set for who may publish the dashboard and who may access the area where the data is stored, but those permissions can be set at the institutional level instead of at the individual library system level.
Proficiency with a standard tool like Power BI is also a transferable skill, which can lead to greater employee interest in participating in the analytics process. Apart from the domain knowledge, creating a Power BI dashboard for a library is no different from creating one for a restaurant, accounting firm, or other business. The opportunity may be opened to data science or data analytics graduate student assistantships on campus, which can build knowledge at the library and provide invaluable experience with real-world data for graduate students. After securing an institutional license and installing the analytics software, it is advisable to start with a simple dataset in a single Excel workbook or Google Sheet. Starting small in this way provides the opportunity to play with the visuals that are available in the dashboard software without investing too much time. Keep in mind that the larger the dataset, the longer it will take for a dashboard to open, and this can create delays in the learning process. Populating a test spreadsheet with a small amount of data may provide a sufficient dataset for experimenting with the possibilities of a dashboard.
At Duquesne, a rationale for each dashboard served as the starting point for its creation: why is the dashboard being created, who will sustain it, where will the data be stored, and when will the dashboard be updated? The next step was writing documentation. Documenting each step, from harvesting the data to populating the dashboard to refreshing the dashboard, proved critical to making the dashboard effort work. Gathering the data followed. Duquesne made a conscious decision not to connect the dashboards directly to library systems. To minimize any risk of Power BI becoming the source of a data breach, no direct links were made between the Power BI dashboards and the underlying databases by way of SQL or API queries. This “air gap” between the systems and the dashboards protects against data being compromised but adds an extra step, because the data must first be collected before it can populate the dashboards. Some of the data used to populate the dashboards was not gathered from a database. An example is the library’s headcount data: students walk the library each hour to count how many people are in the building. For the COUNTER dashboards, however, the SUSHI credentials were entered into LibInsight, and data harvested from LibInsight each month populated the Power BI dashboards.
The final step is to update the dashboards. A conservative approach of manually updating each dashboard every month ensures that someone checks the data to confirm that the system is updating as it should. After refreshing and checking the dashboard, the person responsible signs off on the dashboard tracking document, which is a spreadsheet in a Microsoft Teams channel but could be a spreadsheet or word-processing document stored in any environment. The dashboard updates occur between the first and tenth of each month, though in most months the update takes only two to three days to complete.
E-resource usage was a critical piece because most of the library’s materials budget, and nearly all of its serials budget, is spent on e-resources.4 5 Before the e-resources dashboard, the only information available in the dashboard environment was usage logs from EZproxy, which do not correlate with COUNTER usage. The dashboards allowed data sharing with other constituent groups such as the University Library Committee and the faculty partners who serve as liaisons to their schools. Even for individual instruction sessions, it became possible to show whether usage improved after resources were demonstrated to a class.
Previously, the only way to do any of those things was to contact the technical services department directly for information. Because the previous processes were so time-consuming and labor intensive, the information was collected only quarterly instead of monthly. Now the dashboards are the reference point for finding information, with no special intervention needed.
Power BI Dashboard for E-Resource Usage
There was a dashboard showing circulation transactions for the physical collection, but that accounted for only 22 percent of the collection. The goal was to create a dashboard showing e-resource COUNTER-5 usage to represent usage for the other 78 percent of the library’s titles, ideally one that could be updated monthly and presented with other library data. An early idea was to download reports directly from the vendors, but the reports were too different to work with each other easily. Going back to LibInsight and creating a dataset that included all COUNTER-5 compliant vendors was the key. Some vendors were easy to set up; others needed much back-and-forth communication with customer service representatives to get things working properly. Once all vendors were configured, SUSHI data from previous years was fetched and automatic harvesting of future SUSHI data was set up.
The presenters demonstrated how to create the monthly spreadsheets that can be used by Power BI. In LibInsight, once the SUSHI data has been harvested for each vendor, the analyze function is used to create a dataset for the month. After the first and last dates of the month are typed in and the go button is clicked, LibInsight shows all of the COUNTER metrics for every platform in the dataset. Clicking the CSV button downloads the usage information into one spreadsheet. Making the spreadsheet work in Power BI requires some edits. First, all of the sheets used in a dashboard have to have the same title; in this case the sheets are given the name Platforms Summary. Power BI also needs a date, so a column is added and labeled Date. This column is then filled in with the month, and its number format is changed from General to Date. Finally, the TOTAL row is removed.
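The two row-level edits described above (adding a Date column and removing the TOTAL row) could also be scripted. The following is only a minimal sketch using Python's standard library, not the presenters' procedure. The column names Platform and Total Item Requests, the placement of the TOTAL label in the Platform column, and the sample numbers are all assumptions made for illustration; a real LibInsight export may be laid out differently.

```python
import csv
import io
from datetime import date


def prepare_monthly_export(raw_csv: str, month: date) -> list[dict]:
    """Return the rows of one monthly export with the two edits applied.

    Drops the TOTAL row and adds a Date column holding the given date
    in ISO format (YYYY-MM-DD).
    """
    reader = csv.DictReader(io.StringIO(raw_csv))
    rows = []
    for row in reader:
        # Assumption: the summary row is labeled TOTAL in the Platform column.
        if row["Platform"].strip().upper() == "TOTAL":
            continue
        row["Date"] = month.isoformat()
        rows.append(row)
    return rows


# Hypothetical export for January 2019; the numbers are invented.
sample = """Platform,Total Item Requests
Vendor A,120
Vendor B,80
TOTAL,200
"""

prepared = prepare_monthly_export(sample, date(2019, 1, 1))
for row in prepared:
    print(row)
```

Renaming the worksheet to Platforms Summary is not covered here, because a plain CSV file has no sheet name; that step would still be done in the spreadsheet application.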
The presenters displayed a completed spreadsheet that works with Power BI and stated that even without LibInsight, if monthly usage statistics can be put into a similar spreadsheet, it will work with Power BI. All the columns have the same headings in the same order, all auto-sum functions are gone, each sheet has the same name, and the date column has been added.
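The requirement that every sheet have the same headings in the same order lends itself to a quick automated check before new files are added. The sketch below is illustrative only; the function name, the month labels, and the header names are hypothetical and were not part of the presentation.

```python
def find_mismatched_sheets(headers_by_sheet: dict[str, list[str]]) -> list[str]:
    """Return the labels of sheets whose header row differs from the first one.

    Lists are compared element by element, so a column that is present
    but in a different position still counts as a mismatch.
    """
    if not headers_by_sheet:
        return []
    labels = sorted(headers_by_sheet)
    expected = headers_by_sheet[labels[0]]
    return [label for label in labels if headers_by_sheet[label] != expected]


# Hypothetical header rows read from three monthly spreadsheets.
headers = {
    "2019-01": ["Platform", "Date", "Total Item Requests"],
    "2019-02": ["Platform", "Date", "Total Item Requests"],
    "2019-03": ["Platform", "Total Item Requests", "Date"],  # columns swapped
}

mismatched = find_mismatched_sheets(headers)
print(mismatched)
```

A check like this is cheap to run each month and is in keeping with the advice to catch problems while there are only a few spreadsheets to fix.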
The completed spreadsheets are kept in a SharePoint file. To get started, there were only a few months of spreadsheets. Once the dashboard was created and running correctly, spreadsheets for every month going back to January 2019 were added. The presenters then showed Power BI and how the information in the spreadsheets was used to create the dashboard. After the “Get data” command retrieved the information stored in the SharePoint file, various visualizations highlighted the data. A Total Item Requests section utilized the card visualization type. Slicer visualizations added the ability to select by specific date and vendors, while line graphs showed usage over time, and a pie graph showed usage by vendor. A demonstration of the Power BI dashboard showed the following: how to narrow dates by specific ranges, how to highlight one or more vendors for between-vendor comparisons, and how to visualize other information stored within the spreadsheet.
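For readers without Power BI at hand, the calculations behind these visuals can be approximated in a few lines of code. The sketch below is an assumption-laden illustration, not a description of how Power BI works internally: it computes a single total (the card), totals by month (the line graph), and totals by vendor (the pie graph), with optional date and vendor filters standing in for the slicers. The column names and sample figures are invented.

```python
from collections import defaultdict

# Hypothetical combined rows from two monthly spreadsheets.
rows = [
    {"Date": "2019-01-01", "Platform": "Vendor A", "Total Item Requests": "120"},
    {"Date": "2019-01-01", "Platform": "Vendor B", "Total Item Requests": "80"},
    {"Date": "2019-02-01", "Platform": "Vendor A", "Total Item Requests": "150"},
    {"Date": "2019-02-01", "Platform": "Vendor B", "Total Item Requests": "50"},
]


def summarize(rows, start, end, vendors=None):
    """Return (total, by_month, by_vendor) for rows inside the filters.

    start and end are ISO date strings (inclusive); ISO dates sort
    correctly as plain strings. vendors is an optional set of names.
    """
    by_month = defaultdict(int)
    by_vendor = defaultdict(int)
    for row in rows:
        if not (start <= row["Date"] <= end):
            continue
        if vendors is not None and row["Platform"] not in vendors:
            continue
        requests = int(row["Total Item Requests"])
        by_month[row["Date"]] += requests
        by_vendor[row["Platform"]] += requests
    return sum(by_vendor.values()), dict(by_month), dict(by_vendor)


total, by_month, by_vendor = summarize(rows, "2019-01-01", "2019-12-31")
print(total, by_month, by_vendor)

# Narrowing to one vendor and one month, as a slicer would.
feb_a_total, _, _ = summarize(rows, "2019-02-01", "2019-02-28", {"Vendor A"})
print(feb_a_total)
```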
Using the template created for other dashboards by the Digital and Scholarly Resources Department, a procedure document was created to keep the process up and running. The document answers questions such as: Why was the dashboard created? How often will it be updated? Who will sustain the dashboard? Where are the data and documentation located? It also includes step-by-step procedures for collecting the monthly usage and uploading the information into the Power BI dashboard.
Improvements and Adjustments
After the usage dashboard was up and running, a few upgrades and adjustments were made. The first was to add more years to the dataset. The dashboard was changed from showing eighteen months of usage to showing usage for five years. This was accomplished by adding spreadsheets for past months into the SharePoint file. Another suggestion was to provide a visualization that showed how much usage was for e-books vs. e-journals vs. databases. This was done by adding three more columns to the spreadsheets, from which Power BI could pull this information.
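How the three added columns were structured was not described in detail. As one possible reading, the sketch below assumes each row carries a separate usage count for e-books, e-journals, and databases, and computes each type's share of total usage. The column names and the numbers are hypothetical.

```python
# Hypothetical names for the three added columns.
TYPE_COLUMNS = ("E-Books", "E-Journals", "Databases")


def usage_share_by_type(rows):
    """Return each resource type's percentage of total usage, to one decimal."""
    totals = {column: 0 for column in TYPE_COLUMNS}
    for row in rows:
        for column in TYPE_COLUMNS:
            # Treat a missing or blank cell as zero usage.
            totals[column] += int(row.get(column) or 0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {column: 0.0 for column in TYPE_COLUMNS}
    return {c: round(100 * totals[c] / grand_total, 1) for c in TYPE_COLUMNS}


# Invented sample rows for one month.
type_rows = [
    {"Platform": "Vendor A", "E-Books": "50", "E-Journals": "100", "Databases": "50"},
    {"Platform": "Vendor B", "E-Books": "50", "E-Journals": "100", "Databases": ""},
]

shares = usage_share_by_type(type_rows)
print(shares)
```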
The presentation ended with a section on lessons learned. Lesson one was to start small. It is easier to adjust a set of six spreadsheets than to go back and fix sixty of them. Another was to err on the side of more, rather than less information in the spreadsheets. Leave information in the spreadsheets, even if you do not need it at the moment. If it is needed later, the data will already be there ready to use.
Questions and Wrap-Up
Many of the attendees at the presentation were other librarians who were having similar problems collecting and presenting e-resource usage. Questions ranged from specifics about Power BI licensing to the particular steps used in LibInsight. Some attendees had access to LibInsight but were not using it for COUNTER-5 and had questions about how to get started. One attendee commented that using Power BI instead of a library-specific data collection tool might help administrators locate library statistics more easily.
Contributor Notes
Kathaleen McCormick is the Cataloging & Metadata Librarian at Gumberg Library, Duquesne University, Pittsburgh, Pennsylvania.
Jennifer Ye Moon-Chung is Assessment Coordinator at the University of Pittsburgh Library System, Pittsburgh, Pennsylvania.
Rob Behary is the Systems Librarian at Gumberg Library, Duquesne University, Pittsburgh, Pennsylvania.
Notes
- “Education,” COUNTER Metrics, accessed July 17, 2024, https://www.countermetrics.org/education/.
- Talia, “LibInsight with COUNTER R5: Ahead of the E-Data Curve,” The Springy Share (blog), Springshare, March 13, 2019, https://blog.springshare.com/2019/03/13/libinsight-with-counter-r5-ahead-of-the-e-data-curve/.
- “Gumberg Library Analytics,” Gumberg Library, Duquesne University, accessed May 31, 2024, https://guides.library.duq.edu/c.php?g=1244791&p=9107996.
- “Magic Quadrant for Analytics and Business Intelligence Platforms,” Gartner, accessed March 19, 2024, https://www.gartner.com/en.
- Jessica Urick Oberlin, “Data of E-Resources: Moving Forward with Assessment,” in Technical Services in the 21st Century: Volume 42, ed. Samantha Schmehl Hines (Emerald Publishing Limited, 2021), 155–74, https://doi.org/10.1108/S0732-067120210000042012.