Love Data Week is coming back to FSU in 2023! Love Data Week, or LDW, is an international event where individuals and groups are encouraged to host and participate in activities related to any and all data. It takes place every year during the week of Valentine’s Day and focuses on helping people learn about best practices for managing data and methods for interpreting it. LDW was started in 2015 and is headed by the Inter-university Consortium for Political and Social Research at the University of Michigan. For those who want to learn more about data or are interested in statistics, this is an excellent opportunity to ask questions and get started!
Because looking at raw data can sometimes be boring, we’re looking to spice things up this year by including two new activities! We’ll be right inside the entrance of Dirac from 12:00-2:00 PM on Thursday and Strozier from 12:00-2:00 PM on Friday! First, we’re going to be doing an Adopt-a-Dataset activity, where participants will be able to “adopt” one of the openly available datasets we have displayed. Your task will then be to determine what conclusions can be drawn from the data, and you’ll receive a Dum-Dum for your work! After that, we’ll have a jar of Smarties at the table, with a list of numbers from a normal distribution on hand. From there, you’ll have to guess the number of Smarties in the jar, and the person with the closest guess will win them all! In addition to the tabling events, our Research Data Management Librarian, Dr. Nick Ruhs, will be giving a workshop on Data Analysis with Microsoft Excel on Valentine’s Day (February 14) from 3:00-4:30. If you are or will be using Excel for your projects or research and are looking to enhance your skills, this will be a great workshop to attend!
In addition to the wonderful events that are occurring during Love Data Week, we will be publishing two blog posts introducing the two new Data Fellows at FSU, Reagan Bourne and Sahil Chugani. In those posts, you’ll learn all about what inspired them to become a data fellow and how they became passionate about data analysis and management techniques.
For more information about any data questions or concerns you may have, you can either check out https://www.icpsr.umich.edu/web/pages/ or contact Dr. Nick Ruhs, our resident Research Data Management Librarian, at email@example.com. Furthermore, if you ever need assistance with a data question, you can check out the walk-up hours for our STEM Data Fellows!
This blog post was written by Sahil Chugani (STEM Data Fellow) from FSU Libraries.
Data literacy is the combination of a few unique skill sets: statistical literacy, information literacy, and technical proficiency. It also involves being able to visualize, critically evaluate, determine the accuracy and reliability of, and understand data sets. There are many reasons why it is important to be data literate, especially in recent years with the advent of the internet and social media. Data literacy is also crucial to many different industries and research areas. It is important to interpret the data that you are collecting to make sure that the results are accurate and to be able to understand that data so that you can create useful visualizations for others.
There are a variety of concepts to keep in mind when critically evaluating data. For example, you need to consider the methods that were used to collect the data and whether those methods are ethical. Furthermore, when evaluating how the data is presented, you need to consider whether that representation or visualization is the most accurate way to portray the data. Another particular topic of concern is bias. There are different points at which biases can be introduced, such as when data is collected, when it is analyzed, and when it is shared with the public. Also, if you are critically evaluating your own data, it is important to check that there are no biases within your own work. In this post we will be discussing the critical evaluation of data through the lens of data collection, data presentation and visualization, and data ethics.
In the context of data collection, several different collection methods can be used for research. Some of these methodologies, such as focus groups, surveys, and participant interviews, are familiar to the public at large. However, there are other specific data collection processes that many people outside of certain academic disciplines may not be aware of, such as web scraping/text mining, phlebotomy procedures for blood tests, observational behavior recording for time series data, and many more.
Consequently, documenting how data is collected is important not only for replicating experiments but also for interdisciplinary work. Researchers in one field may not be aware of the data collection methods used in another, even across seemingly similar disciplines. For example, accounting and finance may seem similar but can have drastically different ways of interpreting monetary data: accountants and financial analysts differ in how they calculate when a company breaks even (i.e., reaches net zero between revenues and costs). Even within the same field of research, transparency about how data is collected is important for peer review, whether for ethics accountability or for identifying methodological flaws. When a dataset or its documentation is incomplete, it can be difficult or impossible to know whether the data was collected in a way that prevents bias, or whether the data is accurate and/or precise.
Failing to document data and data collection methods can also create problems reproducing or using the data for further research, particularly if things such as question types, experiment conditions, and units of measure are not properly documented. For example, cold fusion (hypothetical nuclear fusion performed at or near room temperature) would be a low-cost energy solution, but the experimental methods and data behind the best-known claims were not documented well enough for other researchers to reproduce the results. As a result, the concept of cold fusion is now widely regarded with skepticism. A less extreme case is survey research, where the way a survey is constructed can bias responses. Therefore, documenting how a survey was written can be helpful in evaluating why a research study came to a specific conclusion, as well as in testing whether changing the questions or even the question order would change the results.
Furthermore, data cleaning (the process of reformatting or fixing things such as incorrectly formatted or corrupted data so that the data can be used in analysis) can also introduce statistical bias through choices such as eliminating outliers, accidentally dropping a variable, or deciding how to categorize your data. Therefore, documenting how you clean your data is also a critical component of research: explaining which outliers you decided to keep or remove, and why, can help you and other researchers down the road. It is also important to consider the order in which questions are asked and the way they are worded when conducting surveys. While it might seem counterintuitive at first, question order and wording can affect the percentage of people who respond in a certain way, whether potential participants qualify for research projects, and even the numeric values of the data itself.
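One way to make outlier decisions easier to document is to apply an explicit rule and keep a record of what it removed. The sketch below is only an illustration: the interquartile-range (IQR) fence is one common convention among several, and the function name, the readings, and the log fields are all made up for this example.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Split values into kept/removed lists with an IQR fence and
    return a small log describing the rule that was applied."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    removed = [v for v in values if v < low or v > high]
    log = {
        "rule": f"IQR fence with k={k}",
        "lower_fence": low,
        "upper_fence": high,
        "removed": removed,
        "n_before": len(values),
        "n_after": len(kept),
    }
    return kept, removed, log


# Hypothetical temperature readings in degrees Celsius;
# 98.0 looks like a data-entry error.
readings = [21.5, 22.0, 21.8, 22.3, 21.9, 98.0, 22.1, 21.7]
kept, removed, log = flag_outliers(readings)
print(log)
```

Saving the log alongside the cleaned data gives later readers the rule, the thresholds, and the removed values, so they can judge the decision or rerun the analysis with the outliers included.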
Data Presentation and Visualization
Most have probably heard the phrase “label your axes” at some point, even before college. It is often mentioned in K-12 education, with the premise being that someone will not know what your graph(s) are depicting without them. While this is indeed correct, labeled axes constitute only one of many different components of data presentation and visualization.
Figure 1: Axes that are labeled!
A good place to start exploring which types of data visualization suit which purposes is The Data Visualisation Catalogue. While the site was originally established with graphic designers in mind, its creator, Severino Ribecca, stated “I felt it would also be beneficial to…anyone in a field that requires the use of data visualisation.” (Ribecca n.d.) Almost anyone who uses data eventually has to consider how to communicate it visually to an outside audience, or even to the general public outside of academia. A nifty feature of The Data Visualisation Catalogue is that you can filter recommended visualization types by the concept you are trying to demonstrate.
One consideration when looking at a data visualization is whether the data is represented in a way that is appropriate for that specific data type. While it might not seem like the data presentation would differ between data types, certain visualizations will serve to more accurately and sufficiently depict different types of data. For instance, time data and Geographic Information Systems (GIS) mapping data are distinct data types. While they can be combined and represented in the same graphic (e.g., how has the land of a certain area changed over time?), each has its own issues to consider to avoid creating misleading graphics. Namely, one cannot make a map with time data alone, and a line graph that is meant to show trends over time is poorly suited to displaying a map.
Furthermore, the scales and units that are utilized in a data representation are also important considerations! Using our previous example, we can note that the visual scales of a map are different from the visual scales of time series data. For instance, you can get drastically different data visualizations if you transform data from a linear scale to a logarithmic scale (i.e., a scale that plots each value according to the exponent to which a fixed base, such as 10, must be raised to produce that value). This can be useful when the data you are working with spans such a large range that it is hard to see everything at once. For example, a logarithmic time scale condenses millions of years into smaller numbers that are easier to conceptualize, producing graphs where you can see things like different geological eras.
On a more human scale, while logarithmic scales can be used to misrepresent data, a far more common tactic involves a truncated or broken axis on a graph (Figures 2a and 2b): a truncated axis does not start at zero on the y-axis, and a broken axis skips a large number of units. This tactic appears in some graphics used by news outlets, whether intentionally or not. Other ways of misrepresenting data include plotting two graphs that are not on the same scale, or zooming in on the scale to make a trend look far larger than it truly is.
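To make the idea of a logarithmic scale concrete, here is a small sketch that converts values spanning several orders of magnitude into their positions on a base-10 log axis. The ages are illustrative round numbers chosen for this example, and `to_log10` is a hypothetical helper name.

```python
import math


def to_log10(values):
    """Return the base-10 logarithm of each positive value."""
    return [math.log10(v) for v in values]


# Hypothetical ages in years (illustrative, not taken from a dataset).
ages_in_years = [1_000, 1_000_000, 66_000_000, 4_500_000_000]
log_positions = to_log10(ages_in_years)

for age, pos in zip(ages_in_years, log_positions):
    print(f"{age:>13,} years -> position {pos:.2f} on a log10 axis")
```

On a linear axis the first three values would be squeezed into a sliver next to zero; on the log axis they sit at roughly 3, 6, and 7.8, so all four points are visible at once.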
Figures 2a and 2b: Graphical Examples of a graph with a broken axis and a graph with a truncated axis, respectively
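The effect of a truncated axis can be put into numbers by comparing how tall two bars are drawn with how large the values actually are. The following is a minimal sketch with made-up values; `apparent_ratio` is a hypothetical helper, not part of any plotting library.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the y-axis begins at axis_start instead of zero."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (b - axis_start) / (a - axis_start)


# Hypothetical values: a 10% increase from one year to the next.
last_year, this_year = 50.0, 55.0

honest = apparent_ratio(last_year, this_year)           # axis starts at 0
truncated = apparent_ratio(last_year, this_year, 48.0)  # axis starts at 48

print(f"Axis from 0:  second bar is {honest:.2f}x as tall as the first")
print(f"Axis from 48: second bar is {truncated:.2f}x as tall as the first")
```

With the axis starting at 48, a 10% increase is drawn as a bar 3.5 times as tall, which is why checking where the y-axis begins is a useful first step when reading a chart.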
While there are many examples of distinctly misleading graphs, there are also many graphs that accurately portray the data but use an incompatible or inaccessible color palette. Many color palettes used in data visualizations are inaccessible to people with color vision deficiencies such as red-green and blue-yellow color blindness. Using color-blind-friendly palettes can help make visualizations more accessible. Furthermore, providing alt-text descriptions of what a graph shows enhances the ability of screen readers and other tools used by people with low vision or blindness to convey the visualization. Being hard to see or aesthetically displeasing does not by itself make a graph misleading, which is an important distinction to make (although the two are not mutually exclusive!).
Figure 3: A “Painbow” Graph
When examining a dataset, it is also important to consider whether there are any biases present that may affect interpretation of the data. Two common categories of biases are cognitive biases and statistical/algorithmic biases. Cognitive biases involve individuals interpreting the results of a study to best fit a specific narrative. This may involve a data producer deleting data that does not fit the conclusion that they are trying to prove. At the same time, a data producer may also add data that is not accurate in an attempt to strengthen their claims. Furthermore, studies may be designed to collect data that only represents a small subset of a population, while claiming to be representative of the entire population.
Like cognitive biases, statistical/algorithmic biases can distort the conclusions drawn from data; they arise when a sample poorly describes the population it is meant to represent. This kind of bias is significantly mitigated (if not outright eliminated) when the data collection methods are not themselves biased. It is particularly noticeable in artificial intelligence (AI) algorithms. These algorithms are often trained on unbalanced datasets, which leads to skewed results when the algorithms are used for data analysis. Therefore, when examining data that is output by an algorithm, one should consider whether the algorithm has been trained on accurate and representative datasets.
One industry where statistical and algorithmic biases are extremely important to consider is healthcare. For example, many hospitals use artificial intelligence to sort through patient data, which helps doctors determine who needs immediate emergency attention. While there are many benefits to such algorithms, they have also caused problems. In certain instances, an algorithm cannot take into account a patient’s pre-existing medical conditions. In addition, many algorithms that are commonly used in healthcare systems show racial and gender bias. As Katherine Igoe writes in “Algorithmic Bias in Health Care Exacerbates Social Inequities — How to Prevent It,” “algorithms in health care technology don’t simply reflect back social inequities but may ultimately exacerbate them.” Igoe also mentions that certain prediction algorithms used for detecting heart disease were biased in their design. For example, the “Framingham Heart Study cardiovascular risk score” worked very well for Caucasian patients, but not for African American patients. This is because around 80% of the data collected for this algorithm came from Caucasian patients. Training an algorithm on such an unbalanced dataset can lead to unequal care and treatment in medical practice (Igoe). This is just one of many examples of bias due to algorithm design.
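A quick first check for this kind of imbalance is to compare each group’s share of a dataset with its share of the population the results will be applied to. The sketch below is only an illustration: the group labels, the sample, and the population shares are invented and do not come from the Framingham study or any real dataset.

```python
from collections import Counter


def representation_gap(sample_groups, population_shares):
    """For each group, return (share in sample) - (share in population).
    Positive means over-represented, negative means under-represented."""
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts[group] / total - share
        for group, share in population_shares.items()
    }


# Hypothetical training sample: 80 records from group A, 20 from group B.
sample = ["A"] * 80 + ["B"] * 20
# Hypothetical shares of each group in the population being served.
population = {"A": 0.60, "B": 0.40}

gaps = representation_gap(sample, population)
for group, gap in gaps.items():
    print(f"Group {group}: {gap:+.0%} relative to the population")
```

A large gap does not prove that a model trained on the data will be biased, but it is a signal to look more closely at how the data was collected and how the model performs for each group.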
Companies such as Amazon have also faced huge problems relating to algorithm bias. A few years ago, Amazon tried to utilize an algorithm that used artificial intelligence to hire new employees. However, it turned out that this algorithm was biased against women. This is because the algorithm was trained on resumes that were submitted during a time period where the number of male applicants was significantly higher than the number of female applicants. This ultimately caused the algorithm to be trained to favor men over women.
Critical evaluation of data is an extremely important skill set for any student or professional to have. Knowing the importance of checking the reliability, accuracy, and the bias in any data set is necessary when reading or working with data. Some questions to keep in mind are: is the collection method clear and documented? Is the data visualization appropriate for the dataset and for what the author is trying to represent? Is the data biased in the collection or visualization stages? It is important to evaluate data to ensure that we are using quality and accurate data to make sound decisions and conclusions.
d’Alessandro, Brian, Cathy O’Neil, and Tom LaGatta. “Conscientious Classification: A Data Scientist’s Guide to Discrimination-Aware Classification.” Cornell University arXiv, July 21, 2019. https://arxiv.org/abs/1907.09013.
Edwards, D. Brent. 2019. “Best Practices from Best Methods? Big Data and the Limitations of Impact Evaluation in the Global Governance of Education.” 39: 69–85. https://doi.org/10.1108/S1479-367920190000038005.
Lavalle, Ana, Alejandro Mate, and Juan Trujillo. “An Approach to Automatically Detect and Visualize Bias in Data Analytics.” CEUR, Proceedings of the 22nd International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with EDBT/ICDT 2020 Joint Conference (EDBT/ICDT 2020), 2572 (March 30, 2020). https://rua.ua.es/dspace/bitstream/10045/104029/1/2020_Lavalle_etal_DOLAP.pdf.
Loo, Mark van der, and Edwin de Jonge. Statistical Data Cleaning with Applications in R. Hoboken, NJ: John Wiley & Sons, Inc., 2018.
Mark, Melvin, Kristen Eysell, and Bernadette Campbell. “The Ethics of Data Collection and Analysis.” New Directions for Evaluation 1999, no. 82 (November 5, 2004): 47–56. https://doi.org/10.1002/ev.1136.
Paradise, Elise, Bridget O’Brien, Laura Nimmon, Glen Bandiera, and Maria Athina (Tina) Martimianakis. 2016. “Design: Selection of Data Collection Methods.” Journal of Graduate Medical Education 8 (2): 263–64. https://doi.org/10.4300/JGME-D-16-00098.1.
Silva, Selena, and Martin Kenney. “How Computing Platforms and Algorithms Can Potentially Either Reinforce or Identify and Address Ethnic Biases.” ACM Digital Library, October 24, 2019. https://dl.acm.org/doi/fullHtml/10.1145/3318157.
The Midwest Data Librarian Symposium (MDLS) is an annual conference aimed at providing Midwestern librarians, as well as others across the United States, the chance to network and discuss several industry issues and topics related to research data management. This year the event was co-hosted by the University of Cincinnati, The Ohio State University, and Miami University, as well as virtually through Zoom conference calls and presentations. With free registration for all participants, MDLS focuses on the goal of providing low-cost networking and educational opportunities for established professionals and developing librarians of the future. Relatively new to the environment of Research Data Management, I was eager to represent FSU and the entire state of Florida at the Symposium, being the only participant in attendance at the conference from the state. While I could not travel to participate in the in-person programming, the free registration allowed me to actively engage with the virtual conference presentations and events over Zoom, like many others.
Whether it was a Zoom scavenger hunt or a presentation on a less talked about subject, like “Making Infographics More Accessible”, I found that each opportunity to engage taught me something new that I could bring back and put into practice in my own work. The presentations also left me with a lot to contemplate, opening my eyes to concepts I had yet to encounter in my own work, like digital curation and data management for filmmakers and documentaries. For example, in the growing industry of filmmaking, resources are often limited, especially for independent filmmakers, making it hard to meet the costs of preserving their data. With barriers like large file sizes, time constraints, and the threat of file corruption or data loss, documentaries have a much more difficult path to serving as critical sources of historical and cultural documentation.
The vulnerability of data collected for documentaries further illustrates the broader importance of taking serious measures to securely store raw data, especially given its potential to guide other research. Additionally, metadata’s pertinence in other research frameworks reflects the expansive benefits of open science and universal accessibility. Pressures of academic viability, publishing, and performance can make researchers hesitant to relinquish ownership and control of their data. This underscores the need for stronger incentives to share data openly even when it is imperfect or incomplete. Procedurally, share-upon-request protocols have been imperfect, to say the least, because the decision to distribute the data is left to the Principal Investigator of the original research, who may face internal or external factors that motivate, dissuade, or even obstruct their ability to share the data in a timely or consistent manner.
While a variety of topics were covered during the conference, several presentations centered on the new National Institutes of Health (NIH) Data Management and Sharing (DMS) policy that will come into effect at the beginning of 2023. More specifically, there were discussions about the effects of this new policy on data management and sharing, as well as how to prepare and instruct those who need support navigating these changes at the university level. For one of the main presentations on this topic, the authors conducted semi-structured interviews at their university to assess the research data service needs of their constituents and to gather their perspectives on the new governmental regulations. These interviews produced many noteworthy observations. Perhaps the most surprising theme to emerge was that many of the researchers and professors were unaware of or unworried about the policy changes, believing that they’d be able to adapt their research practices and proposals when the new year began. Others wondered how strictly the new policy would be enforced, especially given the loose criteria for what might qualify a submission as an exception and the fact that some aspects of proposals are not tied to scoring, which gives researchers little motivation to put more effort into adopting practices that promote open science. The need to recognize and remove protected health information further supports the importance of collaboration when it comes to properly following research assurance protocols and properly maintaining and storing data.
These interviews suggested that many students and faculty across the country are uninformed about and/or ill-equipped to handle the transition that will take place in the coming months to comply with the new NIH DMS policy. Perhaps an even larger takeaway is that the general level of information literacy is low relative to students’ needs and the expectations they must meet to perform adequately in their fields. Adjustments are necessary to overcome the deficiencies in standard coursework, which often operates on the assumption that students arrive at their academic institutions already having research skills and a working knowledge of information systems, catalogs, and databases. In most cases, an established base of information literacy is required even to know that library resources for these purposes exist, let alone to locate them. To address this dilemma, libraries, as well as universities more broadly, must make an effort to promote their services and resources more widely while also making them more accessible. Without additional infrastructure to develop these skills, students face a much larger barrier in overcoming the limitations embedded in the university academic framework. Differing levels of privilege in access to both technology and experience must also be taken into account in organizing their practicum.
As always, each institution has its own needs and priorities and is equipped with different resources for developing the systems needed to give its student body enough support to navigate academic challenges. Conferences typically follow the same academic code of free exchange on which open science is based. Just look at the public accessibility of the research guides that most universities produce and publish, and one can truly get a sense of the collaborative instruction that academic libraries strive to achieve. The symposium amplifies this ideal, allowing different institutions to come together to cooperate and exchange ideas through dialogue with like-minded individuals working toward mutual goals.
Preparing for the Midwest Data Librarian Symposium, my impression was that I’d simply be attending lectures, where most of the learning would happen. However, in addition to the networking events and opportunities, the interconnected and interactive components of the entire conference made attending the symposium a much more well-balanced exchange of ideas and information. Moreover, MDLS hosted a Slack channel to further promote ongoing discussion and networking, as well as archived notes for each presentation and event that all participants could access and contribute to. In addition, many of the presentations that were longer than the five-minute rapid-fire “Lightning Talk” involved the audience, whether through discussion questions, breakout room consultations, or Jamboard collaborations to exchange ideas on different subjects. The integration of technology was seamless and improved the overall quality of engagement within the presentations and the symposium as a whole. Attending this symposium gave me the chance to consider and discuss countless ideas to bring into practice with my own work. I am grateful for opportunities like these and experiences that enrich professionals at all stages in their careers with an academic environment of common interests and goals.
Author Bio: Liam Wirsansky is a second-year MSI student at Florida State University and the STEM Libraries Graduate Assistant at FSU’s Dirac Library. He currently serves as the President and Artistic Director of White Mouse Theatre Productions at FSU and acts as the Director of Research and Development for the Rosenstrasse Foundation. Liam loves the academic outlet that research has provided him as well as the opportunity to educate and assist students in the development of their information literacy skills.
If you have any questions regarding the Midwest Data Librarian Symposium (MDLS), please contact the organizers at firstname.lastname@example.org.
Some Helpful Resources That Were Shared at the Symposium:
For Love Data Week 2022, we are highlighting our FSU STEM Libraries Data Fellows! These posts, written by the fellows themselves, tell their stories of how they became interested in data-related work and their experience as a data fellow to this point. Today’s post is contributed by Diego Bustamante.
Prior to my role as a Data Fellow, my idea of what data is was defined by my previous work with quantitative data collected from laboratory experiments. For example, when I worked as a Research Assistant I recorded quantitative data for chemistry experiments, like mass, temperature, volume, etc. I then conducted statistical analysis on the data in order to draw conclusions from each experiment. I personally enjoy collecting and analyzing data, especially because it can lead to many scientific and technological advancements!
While searching for jobs in FSU’s NoleNetwork in summer 2021, one job title that immediately caught my attention was “FSU STEM Libraries Data Fellow.” The job description was unique amongst other jobs offered on campus. As a data fellow, I was offered the opportunity to develop several professional skills in data reference, co-hosting programming language workshops, writing and publishing blog posts, and many more. I felt like it was a great opportunity and a good fit with my previous experience and skills, and so I decided to apply. Thankfully, I was selected as one of the inaugural data fellows, leading to a journey of professional and personal development that has thus far surpassed my initial expectations.
One of my first tasks in the program was meeting with different librarians at FSU Libraries. In these meetings I was able to learn about different methods and applications for data analysis in a variety of disciplines. For example, I learned that the Digital Humanities Librarian uses text-mining software to find specific words in books published in the 1800s. She used the data drawn from the software to analyze certain traits of a story by counting the number of times a character participates in a particular type of interaction. This experience helped me realize that qualitative data sets can be used to draw conclusions about a study in much the same way as quantitative data.
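As a rough idea of what counting words in a text can look like, here is a tiny sketch that tallies how often each character’s name appears in a passage. It is not the software the librarian used; the passage and names are invented, and real text-mining tools handle far more (spelling variants, pronouns, context, and so on).

```python
import re
from collections import Counter


def count_mentions(text, names):
    """Count how many times each name appears in the text (case-insensitive)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(words)
    return {name: counts[name.lower()] for name in names}


# Invented sample passage and character names, for illustration only.
passage = "Elizabeth spoke to Darcy. Darcy bowed, and Elizabeth smiled at Jane."
mentions = count_mentions(passage, ["Elizabeth", "Darcy", "Jane"])
print(mentions)
```

Counts like these turn a qualitative source (the text of a novel) into numbers that can be compared across characters, chapters, or books.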
Another concept that I have become familiar with while working as a Data Fellow is open data. We discussed this concept during a workshop where we talked about the potential benefits of making research data openly accessible to the wider research community. Initially, I was hesitant regarding the concept of open data, because I saw academic research as a “race” to find a solution to a given problem. However, further discussion of how researchers are compensated for sharing their data made me realize that it is possible to benefit from open data on a personal and global level.
Currently, I am still learning about the many different types of data and their definitions, applications, and importance. I am also working on developing an open-source Canvas module on MATLAB, in which I explain the basics of the math-based programming language in a student-friendly manner. I look forward to sharing more about this work in the future!
It’s once again time for Love Data Week! LDW is a yearly, international outreach event taking place the week of Valentine’s Day (February 14-18 this year). The week is focused on promoting good data stewardship and best practices around working with and interpreting data. LDW was started in 2015 and is currently celebrated by academic libraries and data organizations around the world. While every institution celebrates in their own way, common activities include data workshops, social media outreach, and more!
Each year, a theme is chosen around which organizations can plan their Love Data Week activities. For 2022, the theme is “Data is for everyone.” This year, we are shining a light on the “people-side” of data, and on how different folks use and interact with data. Data often means something different to everyone, and how someone interacts with data varies based on their chosen discipline, research project, life experiences, and their own beliefs and values. There are also often inherent biases that exist in data collection, analysis, and interpretation, which can affect one’s own impression of a dataset. Despite these differences, the ability to critically evaluate data and interact with it is a universal skill that is crucial for everyone.
As technology continues to evolve, the infrastructure needed to run this technology gets more and more sophisticated. Processes and tasks carried out by personal computers, smartphones, and appliances are increasingly automated and run with minimal input from the user. This is made possible through code that is developed with one or more computer programming languages. However, with the increase in the quantity of software and programming applications, the demand for programmers and the number of languages they are required to learn has increased. Furthermore, many employers now require skills in data analysis and computer programming as prerequisites for job applications. In this blog post, we will discuss the most in-demand languages in the market and give a brief explanation of each. (Grand Canyon University 2020; Jiidee 2020; Meinke 2020; University of California – Berkeley, n.d.)
Maybe you’re on Twitter one day and search ‘#Statistics’ to look up some information for your Introductory Statistics course. Before you know it, you scroll through and see several tweets that are also marked with ‘#BigData’, and you’re left with more questions than you had when you started your search. Maybe you try to search for “big data” on Google, see the definition from Oxford, and are then left with even more questions:
How large is “extremely large?”
What kind of patterns, trends, and interactions are we talking about?
What isn’t big data?
Big data as a term has become synonymous with the growth of digital data and the glut of information available to researchers and the public. Furthermore, there is a growing interest in both the public and private sectors in utilizing large datasets to provide insight into market trends and to improve decision making. However, the exact definition of big data is sometimes unclear and can vary widely depending on who you ask. Businesses, nonprofit organizations, government agencies, and academic researchers each view big data in a different context and with different goals for its use. (University of Wisconsin Data Science, n.d.)
Above: a Google Trends graph that shows the number of searches for the term “Big Data” from 2007 to 2017
In this blog post, we aim to provide clarity and insight into the origins and definitions of big data. We will also discuss the potential benefits and challenges surrounding big data. In doing so, we will provide some examples linking big data to applications or data that you may interact with on a daily basis.
Welcome to the third post in the Get Data Lit! blog series. This post will focus on my experience working as a STEM Research Data Services Associate with FSU Libraries during the 2020-2021 school year. In this role, I assisted with outreach and education around STEM research data services for students, groups, and organizations at Florida State University.
My name is Paxton Welton and I will be graduating with a bachelor’s degree in Finance this semester. One question that you might have right from the start: why is a finance major working in a STEM-focused role?
When applying for jobs prior to this academic year, I knew I wanted a role that would challenge me and allow me to develop new skills. I believed that being the Research Data Services Assistant would provide me the appropriate level of challenge and opportunity that I was looking for. By and large, I believe that my experience provided me with just that. There was a major learning curve that I faced when I first started this role. While I had a grasp of the basics of data literacy and research data services, I quickly realized I did not know nearly enough to be able to properly speak to student groups about these topics. During the first few weeks of the fall semester, I spent a significant portion of my time getting a stronger understanding of data and everything FSU STEM Libraries had to offer to its students with regard to research data. By reading countless articles about data literacy and engaging in weekly discussions with my supervisor, Dr. Nick Ruhs, the STEM Data & Research Librarian, I became confident in my working knowledge of these topics.
As the STEM Research Data Services Assistant, one of my main responsibilities was conducting targeted outreach to different student organizations across campus. When I first started this process, I reached out specifically to STEM-focused groups. This process involved me initiating conversations via email with registered student organizations (RSOs) to introduce them to the research data services FSU Libraries offers them. In several cases, we were invited to meet and/or present synchronously to these groups. This gave us a chance to share more in-depth information about our services and just how valuable they are to students. It also gave students a chance to ask us any questions they may have. Getting the chance to directly interact with students and help them find the right resources to feel more prepared for their future was by far my favorite part of this role.
I also had the opportunity to contribute to data-related events hosted by FSU STEM Libraries. Two examples include Love Data Week in February and the Virtual FSU Libraries Data Services Quest in March. My involvement in these events allowed me to see the entire process of creating programming for students. I was able to sit in on brainstorming meetings, give my input on the marketing materials, and create content for the events.
One of my main focuses throughout this year has been to develop and create the blog series you are reading right now: Get Data Lit! The focus of this blog series was data literacy and its applicability to students’ educational experiences. As such, I had the chance to put into practice the new data literacy skills I learned in this role. I also had the opportunity to connect data literacy to real-world practice and explain the importance of critically evaluating data. Doing so made me realize just how important learning data skills is for my future career and education.
One thing that proved to be a common theme throughout all the work I was doing is that data is powerful and knowing how to work with it is even more powerful. From a career in law to a career in fashion, you are going to be working with data in some form. Learning how to critically evaluate data is going to give you the skills you need to stand out in the future.
By taking on a job in a discipline that I knew very little about, I was able to challenge myself and make the most out of this past year. From getting to work on student programming events to developing a blog series, I was constantly challenged and learning something new.
“Data is the sword of the 21st century. Those who wield it, the samurai.” - Jonathan Rosenberg
Data is all around us and we often interact with it in ways we don’t even realize. From using an app to mobile order our coffee to reviewing a chart provided in an article, data surrounds us and has become so intertwined with our lives. However, with the increasing amount of data available at our fingertips, it can be difficult to understand its meaning, accuracy, and relevance to our lives. This is the reason we decided to start this new blog series, Get Data Lit! We realize that data can be difficult to decipher and want to give you the tools to better navigate the data you are faced with every day.