
[SPEAKER_S1] Welcome, everybody, and thanks very much for attending the Bernard Bailyn Lecture today. I just wanted to begin by acknowledging the traditional owners of all the lands from which we're zooming in today. I'm zooming in from the lands of the Wurundjeri people of the Kulin nations, and I'd like to pay my respects to elders, past and present. I know we have a big audience in many different locations, so please feel free to write your own acknowledgements in the chat. My name is Timothy Minchin, and I'm Professor of North American History at La Trobe. The Bernard Bailyn Lecture was actually set up in the 1990s to recognize La Trobe's traditional strength in North American history. Over the years we've had some wonderful speakers, mainly from the US, talking on different aspects of North American history and connecting it with other types of history as well. After a bit of a hiatus with Covid, it's great to be back in this online format. We're very lucky today to have a wonderful speaker from the University of Kentucky, Doug Boyd, who's going to talk on this very topical question of artificial intelligence and oral history: the good, the bad and the ugly. Doug directs the Louie B. Nunn Center for Oral History at the University of Kentucky Libraries. He envisioned, designed, and implemented the open source and free Oral History Metadata Synchronizer, OHMS, which synchronizes text with audio and video online. Doug is the co-editor, with Mary Larson, of the book Oral History and the Digital Humanities: Voice, Access, and Engagement, which was published by Palgrave Macmillan in 2014. He's also the author of the book Crawfish Bottom: Recovering a Lost Kentucky Community, published by the University Press of Kentucky.
He served as president of the Oral History Association in the US in 2016 and 2017, and conducted research in Australia as a Fulbright Scholar in 2019. He was just sharing some stories about that, how he observed the election that year and also went to some sports games while he was here. So Doug has connections with Australia. Doug managed the Oral History in the Digital Age initiative, authors the blog Digital Omnium, produces and hosts the Wisdom Project podcast, and has authored many articles pertaining to oral history, archives and digital technologies. So it's wonderful to have Doug here today. Just before we start, I wanted to thank La Trobe University for their support of this lecture, particularly the head of department, Kat Ellinghaus, and the dean of the School of Humanities and Social Sciences, Nick Bisley. Thanks also to Judy Hughes, who's going to help today with monitoring the chat and the questions that come in, and to Charlotte Usher in the school admin, who's been wonderful at helping out with the organization. So I'll hand over to you now, Doug. Thank you very much to everyone for coming, and thanks to Doug.

[SPEAKER_S2] Thank you very much. I very much appreciate the invitation. I do want to say that I'm starting to feel old in some ways, because I remember a time, way back when, when the invitation to come give a lecture came and I'd accept, get on a plane, fly to this wonderful place, give the lecture, and meet people in person. I have to say, when I got the first three lines of the email where you were inviting me to give the lecture, I was so excited. I was like, I get to go back, because we haven't been back since 2019. And here I am in my basement. With the wonders of remote technologies, it is wonderful that I get to be with you all. I do hope that sometime soon I can come back, because that was one of the great experiences of my life, and my family's life, I think: living in Australia for six months. Of course, it was Canberra, and everybody would ask, why Canberra? That was the number one question. When Aussies find out, they're like, oh, you lived in Australia, where did you live? I'd say Canberra. And they'd be like, why? But we loved it. We loved our time. And I was working at the National Library, so that made sense. So we're going to talk about AI. I'm going to walk through a little bit of my experience working with oral history and technology as a way to contextualize and frame what's going on with us at my center. I'm going to go ahead and switch to some slides here and give you some visuals. I've put this picture up as a way to visually reflect on this idea of artificial intelligence and this direction that we're going with content and media and data. So we'll get past this title slide here.
But this is a picture on the right there of me at the stadium in Melbourne, watching, I think, the Rebels play at my first Aussie Rules football game. So the Louie B. Nunn Center for Oral History at the University of Kentucky is an oral history center that began in 1973. We didn't begin as a center. We began as an oral history interviewing initiative, essentially. But little by little we kept at it, and we just celebrated our 50th anniversary about three weeks ago. In the beginning, as I said, it was just about interviewing. We are in the library, so part of our mission from the very beginning has been this idea of access and preservation. Interviews come into the archive, and the idea is that we would provide access to those interviews. As I think most would agree, one of the imperatives of oral history practice is to try to get these individual stories on the historical record. So, the Nunn Center, 1973. We now have four full-time faculty and staff: one interviewer and three archivists. I've got "interviewers" up there because we had a retirement. The majority of the work that we do is through external and internal partnerships. We'll partner with the bourbon industry, we'll partner with the horse industry, but we'll also partner with a small organization or a community to launch oral history projects. We have a collection of 18,000 interviews, and right now, as we speak, we have over 50 interviewing initiatives happening. So we've got a lot of things happening on a not enormous budget, but it's very exciting. Oral history is growing here in the US, as it is, I think, around the world, in terms of the methodology and the awareness of the professional oral history community. So this is just a sample of some of the interviewing projects that are active right now: our Women in Bourbon Oral History Project.
Climate Research, Policy and Activism is a project we're working very hard on. We've done a lot of interviews with the Peace Corps. It was a Covid-era project, interviewing returned Peace Corps volunteers, and it was one of those projects that was just made for Zoom, because these people live all over the world and we can interview them anywhere now. It's wonderful. We started that project in June of 2020, during Covid, and we now have over a thousand interviews in that project alone. So it's very active. We also do local history: Lexington, Kentucky, the University of Kentucky. But we do a lot of civil rights work and a lot of politics. In terms of what we do, we talk about the collection and access to our collection. This is one of the things that I've really focused my career on, this idea of enhancing access to oral history. So here's where you go to access some of our interviews, kentuckyoralhistory.org, and you can search, you can browse, you can actually listen to and watch interviews and search transcripts. This is where you actually access that system that I created, which we will talk about. Let me back up and say that I've been truly blessed to have worked only in oral history for all but a year and a half of my professional career. When I say oral history, I mean I'm not doing anything else. I'm literally archiving only oral history, directing an oral history grant agency, directing an oral history center that does just oral history. So I get to think about just oral history all the time. So I think about things like, for example, the year 2008, when I'm coming back to the University of Kentucky to direct this center.
And I think at that point, and for the decade before that, the dirty little secret about oral history was that everybody loved to talk about the interviews that they conducted, and everybody loved to talk about the communities that they engaged while they conducted these interview projects. But the reality was that very few people were actually accessing interviews in the archive. When we have a collection of 6,000 interviews and only 100 or 150 interviews are accessed in a year, that's frustrating, because there's a lot of material here. So I spent a great deal of my time wondering why, and how we can make these interviews more accessible to the general public. I think the bottom line is you need something to search. Uncontextualized audio and video is very cumbersome to access. Imagine going through 300 hours of audio when there's no transcript. So we need text for discovery. A human-generated transcript is expensive, labor intensive and slow. And even when interviews are transcribed, the transcript is still not connected to the audio. When people would present it in a digital environment, they would just throw up the transcript to search and throw up the audio to play, but then you still have to find your way around. I use the analogy of a cassette. When your favorite song was in the middle of the B side, the second side of the cassette, think how difficult it was to actually find your favorite song and navigate to it. Those of you who are old enough to remember listening to music on cassette appreciate how hard we had to work to actually find our favorite music, and oral history was no different. Discovering information in time-based media is a challenge, and we couldn't do a whole lot of transcripts at the time. So in 2008, just to give you a sense, we had 6,000 interviews.
The majority were still analog, with few transcripts, limited metadata, and no public interface. I say 300, but it was probably really 200 interviews being accessed each year. So again, I focused a great deal on accessibility, on enhancing access to these interviews. So I created this OHMS system in 2008. OHMS really did one thing in the beginning. It allowed you to search a transcript and click on the time code that corresponds to the moment where your search hit appears in the interview. It connected your textual search to the audio and the video of the interview. Very simple. A few years later, I implemented the indexing system, quite frankly because we couldn't afford to continue to transcribe everything. In an index, you can create a range of metadata about the stories that are being told throughout an oral history interview: title, partial transcript, synopsis, keywords, GPS coordinates, things like that. So you can allow the researcher to dig in and get a sense of what that interview is about very efficiently. It's a different search. You're searching for stories as opposed to searching a transcript for words. When you search a transcript for words, that's great. That's fantastic. However, somebody could be talking about living under segregation for three hours and never once mention the word segregation. So the idea was creating a way to map natural language to concepts. Now, keep in mind, I'm talking to you about artificial intelligence, but this is human generated. Somebody is listening to an interview and typing and tagging these moments as they go. One of the more popular features of OHMS indexing is this idea of GPS coordinates. While somebody is talking about a place, in this case Philadelphia, you can listen to them telling this story and then click on the map and be taken to that place on the map, which is fantastic.
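The two kinds of search described here, searching a transcript for words and searching a human-made index for stories, can be illustrated with a small sketch. This is not how OHMS itself is implemented; the `Segment` structure, the function names, and the sample data are all hypothetical and exist only to show the idea of tying text back to a time code.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A stretch of an interview: where it starts, what was said, and human-assigned keywords."""
    start_seconds: int
    text: str
    keywords: list[str] = field(default_factory=list)


def format_timecode(seconds: int) -> str:
    """Render a number of seconds as HH:MM:SS, the form a listener would click on."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_words(segments: list[Segment], term: str) -> list[str]:
    """Transcript search: time codes of segments whose spoken text contains the term."""
    needle = term.lower()
    return [format_timecode(s.start_seconds) for s in segments if needle in s.text.lower()]


def search_index(segments: list[Segment], concept: str) -> list[str]:
    """Index search: time codes of segments a human tagged with the concept."""
    wanted = concept.lower()
    return [
        format_timecode(s.start_seconds)
        for s in segments
        if wanted in (k.lower() for k in s.keywords)
    ]


# Hypothetical sample data, not taken from a real interview.
segments = [
    Segment(0, "I grew up a few miles outside of town."),
    Segment(754, "We couldn't sit at the lunch counter downtown.", ["segregation"]),
]

word_hits = search_words(segments, "segregation")
index_hits = search_index(segments, "segregation")
print(word_hits, index_hits)
```

In this sample the word search finds nothing, because the speaker never says the word, while the index search returns the time code of the tagged story. That gap is the reason for mapping natural language to concepts.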
Yeah, this indexing really pretty much transformed access to our collection, as well as others. When we got the grant to make the system open source and free, in 2012 I think it was, people started using it. Here's a timeline. We got the grant in 2012, and in 2014 it became available. As of 2021, over 700 institutions in 60 different countries are now using this system to enhance access to their collections. And it really did make a difference to our collection here at the University of Kentucky, where we now have 18,000 interviews. We have about 6,500 interviews that we're providing enhanced access for, so each either has an index, a transcript, or both. We are generating about 500 new indexes, time summaries, each year. So 500 interviews are going online each year for researchers. And last year, 238,000 interviews were accessed around the world. So we've gone from 200 or 300 interviews a year to 238,000 interviews being accessed around the world. It really has transformed access to our collection as well as so many other people's collections around the world, which is really gratifying. I could tell you all kinds of stories about that, but that's not what this talk is about. We're going to talk about AI. So this idea of taking the process of oral history, where we conduct an interview, deposit it in the archive, and then somebody takes off work, travels to your archive, and sits in a research room to listen or read transcripts, really was not working, especially for oral history. And so the internet comes along, but not just the internet, because it really took an additional push, tools like OHMS, to enhance access so that these interviews would be available to a wider audience. So once again, though, we're talking about the AI aspects of this. AI is something that I think a great deal about. So, back to this kind of semi-weird picture.
When I break down this idea of the good, the bad and the ugly, it was actually a really fun exercise for me to go through, thinking about how I can break this down. In retrospect, I think I would have reversed the order: the ugly, the bad and the good, so we could end on the good note. But I did structure the talk based on the good, the bad and the ugly, so be warned. Essentially, with regard to the good, I'm talking about access and discovery. With the bad, we're talking about the privacy issues and the bias issues that are ever present and potentially intensified by AI. And then, when we talk about the ugly, we really get into this idea of post-truth and the authenticity challenges that are going to be faced by the field of oral history, for sure. So let's focus on the good. Let's talk about what we're doing at the Nunn Center, what I'm doing in terms of access and discovery with regard to AI. The Nunn Center is going to be implementing, and already has implemented on some level, AI as a way to process information in our collections. We talked about OHMS, right? OHMS is this great thing. It's been wildly successful. But it's still human generated. It's still labor intensive, and it still requires text for search. So how are you going to create that text? You're either going to create it using a human indexer who's going to listen and type, or you're going to create a transcript. In the past, the best transcript has always come from the human who sits down and types word for word. They're never perfect, but in oral history we try to get them as close to perfect as possible. And by perfect we mean a verbatim representation. So the holy grail for oral history has traditionally been speech recognition.
People have been waiting around and talking about that for 20-plus years, thinking it's going to happen. And then the experts would say it's never going to happen, that it's always going to be flawed because of accents. It's not just about the Australian accent versus the American accent versus somebody's accent in, you know, Niger. It's about regional accents too. From region to region here in Kentucky, you can go from one county to another and have a different dialect, a different accent, and it can be very thick and very difficult. Machines would traditionally really struggle, I think, with heavy accents, so they would get a lot of things wrong. About every year or thereabouts, I dive in and test the most popular speech recognition systems. This is an example of what that test looks like. I go in and identify where it gets things wrong. Here in the yellow highlights, you see where it made a guess and got it wrong. In that first example, Claremont versus Clermont, it's a different spelling. Claremont is in California. Clermont, spelled with an E, is in Kentucky, which is where the Jim Beam distillery is located. That's a common mistake, and if a transcriber doesn't look up where the Jim Beam distillery is, they might make that same mistake. Then there are other things where it just flat gets it completely wrong. Further down on the left there, you see "bar sounds." They're talking about a town called Bardstown, Kentucky, and it just completely gets this wrong. This test was run a few years ago, by the way, but many of the systems will still give very similar results. So it gets me thinking about what we really need from speech recognition as a field.
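The kind of test described above, marking where a machine transcript departs from a checked reference, can be approximated with a word-level comparison. The sketch below uses Python's standard `difflib` module; the function name and the two sample sentences are hypothetical stand-ins for the highlighted slide, not the actual test material.

```python
import difflib


def word_differences(reference: str, machine: str) -> list[tuple[str, str]]:
    """Return (reference_words, machine_words) pairs wherever the two texts disagree."""
    ref_words = reference.split()
    machine_words = machine.split()
    # autojunk=False keeps the matcher from discarding frequent words in long transcripts.
    matcher = difflib.SequenceMatcher(a=ref_words, b=machine_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((" ".join(ref_words[i1:i2]), " ".join(machine_words[j1:j2])))
    return differences


# Hypothetical sentences built around the two errors mentioned in the talk.
reference = "the distillery is in Clermont near Bardstown Kentucky"
machine = "the distillery is in Claremont near bar sounds Kentucky"

for expected, heard in word_differences(reference, machine):
    print(f"expected {expected!r}, machine produced {heard!r}")
```

A list like this is only a starting point for a human reviewer. It shows where the two versions differ, not which one is right.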
When we talk about the traditional transcript and the role of the transcript in oral history, especially in the archive, we're really talking about accuracy: verbatim perfection. We want perfect. And perfect is a very difficult bar to attain, so that makes it expensive. I started this thinking when I was in Australia. I started thinking, well, what if we used speech recognition not to achieve perfect, but to get to mostly accurate, mostly verbatim, good enough? Would it be useful? And the answer is yes, because, as a graphic I'm going to show later on suggests, the searchable terms are pretty accurate, I think, in speech recognition. I guess what I'm saying is that the terms that people would normally search are pretty accurate in general. Not perfect, but accurate. So as I said, I did a test, and we're going to do a quick test here: human versus the machine. We're going to listen to a one-minute clip. This is taken from our Black Women and Bourbon oral history project. This is an interview with a woman named Fawn Weaver, who launched a distillery called the Uncle Nearest distillery. She discovered the story of Uncle Nearest, who was the enslaved person who worked for Jack Daniel in Tennessee and taught Jack Daniel how to make whiskey. And she's launched this whole new distillery and this whole new brand. All right, so here we go. We're just listening to one minute.

[SPEAKER_S3] Tell me about Fawn before Uncle Nearest Premium Whiskey.

[SPEAKER_S4] I was the same person, just a different focus. But I've been an entrepreneur my entire life and an author for the last 15 years. And all of that has just kind of come together with this particular project. But Fawn before, Fawn after: same person.

[SPEAKER_S3] So I read that you first learned about Uncle Nearest's story from a New York Times article that was written by Clay Risen, who you share a birthday with.

[SPEAKER_S4] Yeah. Love Clay, love Clay.

[SPEAKER_S3] So what did Nearest Green's story mean to you as a Black woman and also a whiskey drinker?

[SPEAKER_S4] I don't know about as a black woman, but just.

[SPEAKER_S2] Okay, so this was a test. I actually had student staff away for the summer, I think, when this interview was conducted, and so I didn't have anybody who could clean up the interview. Traditional practice for us is to pay a vendor to transcribe the interview, but then have a student go back through the interview and just make sure all the terms are right and correct. We call it authentication. Other people in the field call it auditing or final edit, whatever, but making that last pass seems to be a best practice. So here is what was actually said. Then I run this through speech recognition, and I'm going to compare it to the human. I'm not going to play the clip again, but there was one really interesting deviation in the beginning of this clip. That was where the interviewer says, "You learned about Uncle Nearest's story from a New York Times article that was written by Clay Risen, who you share a birthday with." The interviewer starts to laugh. They both kind of laugh, but while they're talking over each other, Weaver says, "Yeah. Love Clay, love Clay." Like what we see back there, right? Go back. There we go. "Yeah. Love Clay, love Clay." That's what she said. So this is clearly a mistake, and this is a very common mistake that a machine would make, because somebody is talking over another person. It's hard enough for a machine to get the two-people thing, but when they're talking over each other, the machines just traditionally implode. That's the equation for failure. There are a couple of things machines really don't like. They don't like accents. They don't like a lot of background noise. But talking over one another, it's awful. That's when the machines really implode. So, of course, that said,
I'm just going to point out the fact that in this interview, the one that got it wrong was the human being, the human being who we paid double. We paid $250 per interview hour to transcribe and authenticate this interview. Again, I mentioned that we didn't have students who could work on the authentication that summer, and so we paid the transcriber double the rate. This is a company we've used for 20 years, and they're one of the best in the business, truly. Yet the human got it wrong, and the speech recognition got it correct. I think it's the first time that I've seen where I put a very challenging audio or video clip to the machine and it was able to beat the human. And this was only one example, throughout the rest of this transcript, of where the machine actually beat the human in terms of accuracy. This was using a free system called Whisper. Whisper was recently released by the people who created ChatGPT. It's a library, so it's not a service that you can just upload to. It has to be something that you install on a server. There's actually a really nice little app that was released called MacWhisper, if you're a Mac user. But the machine beat the human in terms of accuracy. And this is a very big deal, I think, in terms of representing what's possible now, because we have got to a point where we could generate text for every single interview in our collection, 18,000 interviews. In the past, we would just index because we couldn't afford to transcribe everything. We would transcribe maybe 100 or 150 interviews a year. But we brought in 2,200, 1,953 interviews last year. There's no way we're going to be able to transcribe all of those. I say it beat the human in accuracy, but here's what that transcript actually looked like. You know, go back.
There weren't the names and the colons and the nice formatting. It's just a big block of text. I would say it takes about 6.5 hours per interview hour of formatting and cleanup to clean this stuff up. But this opens up a whole new world, because suddenly we have textualization. We have the ability to go into this data, these transcripts, and extract keywords. There are methods now with AI to go in and do some natural language processing, is what they call it, where we extract what we think is a thing, essentially. They call it named entity recognition. Named entity recognition basically pores through a bunch of text and looks for people, places, things, events, all kinds of things, and extracts them for you. Now, I'm going to switch over to the actual live web here, which is always scary. Here I took Fawn Weaver's interview and plugged it into this tool that uses a named entity recognition engine called spaCy. And here we go. It's got the people. It's got dates. It does locations. The New York Times it classified as an organization, but it gets the author, Clay Risen, and identifies that person as a person. Clay Risen. That's a tough name. That's not a common name. Both of the words in Clay Risen's name are words that mean other things outside the context of a name. So that is an example of where the machine got it right. I just ran this about an hour ago. So named entity recognition is something that is very real now and very possible. In fact, we've got a grant where we are currently working to incorporate named entity recognition into the OHMS system. And OHMS is free. So when you take your transcript and upload it into OHMS, in about a year and a half OHMS will create that metadata for you automatically.
And so the possibilities are really pretty fantastic, where we can take some text that hasn't been cleaned up, extract some really good metadata from that text, and make things searchable without having to fully process the interview. Now, I will say that I am designing systems now, and in every single case, as I design these systems for processing oral histories on the archive side, there is a human phase to this. Named entity recognition is very flawed. We'll use the bourbon example again: Jim Beam. If somebody mentions Jim Beam, that named entity recognition engine is going to say, oh, Jim Beam, that's a person. But Jim Beam is also a brand of whiskey. It was a historical person, but it's also a brand of whiskey, it's also a global company, and it's also a distillery in Clermont, Kentucky. So odds are real good that the named entity recognition, at least as it's being deployed these days, will get it wrong. So what we're going to do is build into this whole thing an editor where, as you're editing the transcript, you will be able to identify the names very quickly and easily and basically approve them, or say, you know what, you said Jim Beam was a person, but in this particular context we're talking about the brand of bourbon. This is going to be about a year and a half away. But you'll see on this a series of dashboards, and those dashboards are where the human comes in, reads what the machines are saying, and thinks about these things. Then, as you go into the editor, you're going to be able to create some really good metadata very efficiently. And what does this give the researcher? This is going to transform access to oral history collections and archives.
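The approve-or-correct step described for the editor can be pictured as a small review record: the machine proposes a label, and nothing counts as final until a person has approved it or replaced it. The sketch below is an assumption about how such a record might look, not the planned OHMS design; the class, the function names, and the label strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EntitySuggestion:
    """A name found in a transcript, the machine's guess at its type, and the human decision."""
    text: str
    machine_label: str
    reviewed_label: Optional[str] = None  # stays None until a person has looked at it


def approve(suggestion: EntitySuggestion) -> None:
    """The reviewer agrees with the machine's label."""
    suggestion.reviewed_label = suggestion.machine_label


def override(suggestion: EntitySuggestion, label: str) -> None:
    """The reviewer replaces the machine's label with the right one for this context."""
    suggestion.reviewed_label = label


def pending(suggestions: list[EntitySuggestion]) -> list[EntitySuggestion]:
    """Everything that still needs a human decision, as a dashboard might list it."""
    return [s for s in suggestions if s.reviewed_label is None]


# Hypothetical machine output for one passage of an interview.
suggestions = [
    EntitySuggestion("Jim Beam", "PERSON"),
    EntitySuggestion("Clay Risen", "PERSON"),
    EntitySuggestion("Clermont", "PLACE"),
]

override(suggestions[0], "BRAND")  # in this passage the speaker means the bourbon
approve(suggestions[1])

print([s.text for s in pending(suggestions)])
```

Keeping the machine's label alongside the reviewed one also leaves a record of how often the machine was overridden, which is one way to see where it struggles.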
Really, truly, most oral history interviews that are in archives are pretty inaccessible and still stuck on the shelf, so to speak. There's a downside to these named entity recognition things, but for right now we're focusing on the positive, and it's really pretty powerful. We're looking at a situation where my entire collection can have a rough draft of a transcript and be searchable enough to have named entity recognition extract some terms and truly make our interviews searchable at a very high level. So when we switch from the good to the bad, we've got a couple of things with privacy and bias. I'll start with the algorithm. We talk about biased algorithms today. Well, named entity recognition is absolutely going to have a potential bias. So once again, we are in a scenario where AI is being deployed by us, and we'll have human intervention periodically throughout the process. There's no doubt about that. It's too dangerous otherwise. Speech recognition, for example, can be wildly problematic. I'll go back to the bourbon example. When I was testing systems other than Whisper, I kept getting references to Iraq, the country, in these Kentucky bourbon interviews. Now, bourbon is an international spirit. It's very common in Japan. It's taking off in Europe and Australia. It's popular in Russia as well. But Iraq is not one of those countries where I would expect them to be talking about this. I ran the speech recognition, I ran named entity recognition, and it kept pulling out all these references to Iraq. Well, what's happening is they're making a reference to the rack house. The rack house is the warehouse where they store the bourbon barrels, where the bourbon ages.
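One modest way to catch errors like the Iraq and rack house confusion is to keep a project-specific list of known mishearings and flag any extracted entity that appears on it for a person to check. This is a sketch under that assumption; the list contents and the function name are hypothetical, and it deliberately flags rather than auto-corrects, since a speaker could genuinely be talking about Iraq.

```python
# Hypothetical list for a bourbon project: machine output on the left,
# the domain term it is often a mishearing of on the right.
KNOWN_CONFUSIONS = {
    "iraq": "rack house",
    "claremont": "Clermont",
}


def flag_suspect_entities(entities: list[str], confusions: dict[str, str]) -> list[tuple[str, str]]:
    """Return (entity, likely_intended_term) for entities that match a known mishearing."""
    flagged = []
    for entity in entities:
        likely = confusions.get(entity.lower())
        if likely is not None:
            flagged.append((entity, likely))
    return flagged


# Hypothetical entities extracted from a machine transcript.
extracted = ["Kentucky", "Iraq", "Japan", "Claremont"]

for entity, likely in flag_suspect_entities(extracted, KNOWN_CONFUSIONS):
    print(f"check {entity!r}: the speaker may have said {likely!r}")
```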
And so, again, blindly using artificial intelligence and natural language processing is just not going to cut it. The human being has to intervene throughout this process, again through these dashboards. One of the things that really haunts me is this idea that as we're accelerating innovation, we're also accelerating risk. OHMS has really transformed access to the Nunn Center's collection. That's a given. It's fantastic, and I love touting that. But what it's also done is increase the number of times somebody has called me and said, please take my interview down. You know, I'm applying for a job and your interview is number two on a search hit of my name. And they're talking about some deeply personal health conditions in their interview, or talking about their sexual identity, or whatever. It's something deeply personal, and they don't want it everywhere. They signed the deed of gift. Informed consent said, hey, this interview is going online, you okay with that? Yes, I'm okay with that. But the circumstances changed, and it's a little different when our interview is number two on a search hit of your name. Which, again, was part of my career intention: how can we take interviews and release them from the shackles of the archive? And we did that. But careful what you wish for, because here are just some of the things, in a small sampling of interviews, that we discovered references to while we were indexing the interviews. So we absolutely have transformed our workflow from a privacy standpoint, in terms of looking out for certain triggers. I've got a whole thing about how we've done that. But this is to say that privacy is a real concern for most oral history interviews, because most oral history interviews are about details of somebody's life.
So I think the big frontier for oral history, especially... I don't think anybody is getting informed consent right. How can we explain to every individual that we interview the true ramifications of putting interviews online? Because we don't really understand what those ramifications are ourselves. And so I think we need to do a much better job from that perspective. And I think not just informed consent, but informed accessioning. I snuck this picture in, this picture from when I came home one night to our little house in Canberra. And, uh, Americans, this, like, freaks them out. They're like, "Oh, you lived in Australia," you know, and they all think that I'm going to get bitten by a brown snake or that I'm going to find this spider in my shoe. You know, Americans really, truly fixate on a lot of the things that can kill you in Australia. So I put this in there to freak everybody out. But somebody in Australia pointed out to me while I was down there that this is one of the good guys. This is the huntsman, I think, and this is the one that you kind of keep around. Anyway, informed accessioning is about knowing what's coming into the archive. The only way we can do that realistically is through automation. So now we can sort of generate a transcript. And then what we're going to do, in this thing that we're doing with AI, is we're going to actually build in and automate the sensitivity analysis based on what we find in that speech recognition. And so: looking for references to addiction, looking for references to sex, looking for references to deeply private personal things, sexual assault or drug use or crime.
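[Editor's illustration] The automated sensitivity analysis described here can be pictured as a keyword scan over the draft transcript that tells the archive which categories of personal information an interview touches. The following is a minimal sketch under stated assumptions: the categories echo the ones named in the talk, but the specific trigger phrases, the function name `sensitivity_report`, and the sample sentence are hypothetical. A production system would need a far richer vocabulary and human review of every hit.

```python
import re
from collections import defaultdict

# Hypothetical trigger phrases grouped by category; a real list would be
# much longer and tuned to the collection.
SENSITIVITY_TRIGGERS = {
    "addiction": ["addiction", "addicted", "rehab"],
    "sexual_assault": ["sexual assault", "assaulted"],
    "drug_use": ["drug use", "overdose"],
    "crime": ["arrested", "prison", "crime"],
}


def sensitivity_report(transcript, triggers=SENSITIVITY_TRIGGERS):
    """Map each triggered category to a sorted list of (offset, matched text)."""
    report = defaultdict(list)
    for category, phrases in triggers.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, transcript, flags=re.IGNORECASE):
                report[category].append((match.start(), match.group(0)))
    # Only categories with at least one hit appear in the result.
    return {category: sorted(hits) for category, hits in report.items()}


draft = "After he was arrested, he went to rehab for his addiction."
report = sensitivity_report(draft)
for category, hits in report.items():
    print(category, hits)
```

Nothing is edited or removed by this scan. It only surfaces passages so that the archive knows what it holds before an interview goes online.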
You know, I think: what are the different words that are going to potentially indicate that there could be a problem in this interview from a deeply personal and a privacy standpoint? It doesn't mean we're going to, you know, edit anything out. It doesn't mean that we're going to ditch it from the archive. But the archive needs to know that we have some of this deeply personal information, and we never knew that in the past. So as we march through this thing that I'm sort of envisioning here, we're going to upload interviews, we're going to transcribe those interviews, we're going to take that draft transcript and we're going to extract keywords from it. We might even use, you know, an AI engine to generate a summary. We create a dashboard where we can approve things. And then we're going to have a really great editor where we can go in and edit that speech recognition transcript that's a little rough around the edges, and figure out, you know, that it got some of the named entities and place names and things wrong. It's this place where we're going to hook up to something called Wikidata, where if somebody makes a reference to Canberra, well, Canberra has got a Wikidata entry plus GPS coordinates, and that's automatically going to be harvested. And so essentially you're going to be able to semi-automate this process of creating an index by simply editing the transcript, which is going to be fantastic from an efficiency standpoint. Then we've got the sensitivity review, where that transcript is done: now, tell me if you think there's something in this transcript or this interview that we need to really examine from a potential privacy standpoint. And then we're going to export that interview and move it on down the line. So, you know, your typical interview is going to potentially... I did a little search on my database recently for "maiden name."
You know, 75 hits. "Elementary school," 408 hits. "Best friend," 150 hits. So if we were going to reset our credit card password, those are three questions that we would use to reset a credit card password. So I analyzed four minutes of an interview that is the most friendly interview, the least dangerous interview I could find. Right? So this is an interview with a returned Peace Corps volunteer. This person, in the first four minutes, reveals these entities: the fact that they served in the Peace Corps, their full name, where they were born, the year they were born, the names of the schools they went to. In this case, oops, sorry, they went to Notre Dame University and made a reference to the Fighting Irish, which is the mascot for Notre Dame. And typically, if you make a reference to the Fighting Irish, you're talking about the football team. So this person is probably a football fan. They're Catholic, went to private school early on. So all these details in the first four minutes. And so the data points that you're creating, that can be triangulated, are remarkable. So remarkable, and something that I think is problematic. The privacy problem preceded AI. This was a problem that really came before the AI revolution that is now upon us and quickly gaining ground. But AI is intensifying this. How are we going to feel when our oral history interviews are being used to inform ChatGPT? Or is that a good thing? Because ChatGPT is inherently biased. And so we can actually use our materials, just like in a search engine, where some of these interviews could potentially give better information, but at the risk of, you know, potentially exploiting the people who we interviewed, whose stories we pretty much hold sacred. So it creates difficulties, there's no doubt.
I think as we get really close to moving from the bad to the ugly, this is where we really look at this idea of authenticity and this sort of post-truth world that we live in. We could always edit audio and video, and you could tell when video was edited, always, you know. But this idea of creating things from scratch, generative AI: we're generating something completely new where before it didn't exist. Right? In this particular case, all I did was take a photograph that was taken and, you know, slap some backgrounds onto it. Right? This is something that can be automated. You subscribe to a service, upload your pictures, create a character, and then you can make that character do anything, wear anything, and be anywhere, and put out photographs of this. So which photograph is the real one? Anybody going to guess? Am I a commentator at the Aussie rules football game? Probably not. I spend a lot of time in recording studios. Or am I in this weird warehouse thing that looks like a church, except the thing behind it is red and multicolored? Which one's the real one? Anybody? It's this one on the left, where I'm in this weird, church-like place. So, editing audio has always been easy. I can take somebody's interview and make that person say something different, right? I can rearrange words and change the cadence of speech. That's a different thing from what we're talking about when we talk about speech synthesis. We are talking about taking a sample of your voice. All right? Enough of a sample of somebody's speaking voice, recorded, can create a model where I can type something in a word processor that will be your voice, your voice entirely. And so you can make something go from somebody saying, "I think he did a wonderful job while he was in office. Those were really eight really good years," to, "I watched him commit crimes while he was in office."
"In fact, I'm going to tell you about one in particular and name names," you know. So the idea of speech synthesis is very real. A quick search on Google, and you can actually find all the apps that will allow you to do this, that will allow you to upload audio and create your Doug Boyd, and you could have somebody who appears to be me saying anything that you tell it to say. So this idea of a cloning generator, and this idea of Photoshop for the voice, is really frightening when you think about the political ramifications. And, you know, just all kinds of things, I think, are potentially scary. You know, the actors in Hollywood went on strike, and one of the major tenets was this idea of being an actor and having your AI replica made, so that we can actually have, you know, a movie created where the actor didn't actually act in the film, but it looks like the actor is acting in the film. The most famous model for this, I think, is, you know, the Tom Cruise deepfake, if anybody is on TikTok: the deepfake Tom Cruise, where somebody has actually faked Tom Cruise and has this regular account that puts out content, and it's real. It's real. Watch. Uh, I'll just put it out there.

[SPEAKER_S5] You know, I do all my own stunts, obviously. Uh, I also do my own music. I've got a sweet spot for a couple of artists, and, uh, people are surprised that I'm a big Dave Matthews guy.

[SPEAKER_UU] Sings like an angel. You got your ball.

[SPEAKER_S2] All right, so that's enough of Tom Cruise, fake Tom Cruise singing Dave Matthews. But that's Tom Cruise's laugh. You know, this is Tom Cruise's voice. This is Tom Cruise's face. It's been generated completely from scratch. It's 100% fake. It's not like they took footage from a movie of Tom playing the guitar and just put in this Dave Matthews voice. No, this is completely generated. And so when you think about the implications for oral history, that's where we get into this conversation. You know, during the Trump years, a lot of people talked about fake news, right? Well, it's inevitable that we're going to move into fake history, and that is very real for us to start thinking about in the historical community, and specifically in oral history, I think, where we're talking about fake history, but fake history that is drawing upon fake primary sources, primary sources that have been 100% fabricated. So as we think about this idea of the role of oral history in our world, in society, this is going to really be a challenge, because we are getting to the point where you can't hear something and really, truly believe it. You can't see something and really, truly believe that what you're seeing is real. And I think there's this new emerging role for historians and archivists, where our focus is really going to be on, you know, being authenticators in some ways, of saying: this is a real interview, and here's how I'm going to tell you and prove to you that this interview wasn't a complete fake that was concocted in 17 minutes by, you know, a teenager on a computer. So, I want to leave time for questions. I'm right about at the time that we said we were going to stop, because I want to have a conversation about this with you all.
But I talked about reversing this. I wanted to be like, I should have done the bad, the ugly, and the good, just from a general morale standpoint, because this deepfake thing just bums me out completely. And it's going to be a challenge that I don't think we're quite ready for right now, not just as an archive community, not just as a community of historians, oral historians, folklorists. I don't think society is quite ready for this. So we've got a lot of work and catching up to do. So this past summer, I got to spend a couple of days in London with Al Thompson. And Al's not here. I think Al told me he got the dates wrong, and he's going to have to watch the recording. So, Al, I'm making reference to you while you're watching this recording. I did this while we were in London, because we were both participating in a symposium at the British Library, and everybody was talking about AI and ChatGPT, you know, and how ChatGPT is going to kill the research paper as a genre, and how are professors going to handle that in the classroom, and whatnot. And so, while I was on stage, I actually typed into ChatGPT a prompt: "Construct a fictional oral history interview transcript for an interview conducted by Australian oral historian Al Thompson, and focus the interview on the concept of fatherhood in Australia." And I just watched it go. And it started going, and, you know, text was scrolling up the screen as it literally created, before our eyes, live, an interview for "Exploring Fatherhood in Australia," conducted by Al Thompson with somebody named John Anderson. I don't know where that came from. On this date, May 18th, 2023. I think that was what the date was when I put in the prompt. It's 30 minutes, and it begins with Al Thompson saying, "G'day, mate," which couldn't be more biased, right?
I mean, I think, here, you know, I'm asking ChatGPT for an oral history interview conducted by an Aussie, and it starts throwing around "G'day, mate." And, you know, "I'm Al Thompson, your friendly Australian oral historian. Today I'm thrilled to have a yarn with John Anderson, a true blue Aussie dad from Sydney. We'll be having a fair dinkum chat about fatherhood in Australia. Welcome, John." So AI was going to try and squeeze every Australianism, every stereotypical Australianism, into that first paragraph. But it goes on. And if you were to subtract out the sort of quirky Australianisms from this transcript, there's actually some really interesting, intriguing stuff being talked about here. You know, this person is reflecting on fatherhood, which was one of the focuses of Al's most recent, I think, oral history initiative, or one of them. So, you know, getting into gender roles and how they've loosened up and redefined things for fathers. And, you know, if you, again, can filter out some of the ridiculous, there's some real there. And just know that at this point there's enough Al Thompson that has been recorded that we could actually model Al Thompson and get him actually speaking in an interview, potentially with John Anderson, who we make up from the ether. So I think this really is going to push the limits of our practice in ways that we don't really know. And I think this idea of the interviewees that we work with: you know, we talk about shared authority and we talk about trust. It's going to be really a challenge moving forward as these interviews become available and start to get reused and become data. Some of these life histories are going to become data that is informing these AI models, and we're going to have to grapple with that.
So I'm going to turn this over to the Q&A now, and we're going to move away from Al's interview with the fake Australian and open the floor for some questions, if you have any.

[SPEAKER_S1] Yeah. Thank you Doug. That was a really wonderful, you know, far reaching presentation. Really thought provoking. I think there's a lot to really reflect on there. So, the format is you can type questions in the chat and Judy is here helping to moderate it as well. So does anyone want to start.

[SPEAKER_S2] Converter in this format to field questions.

[SPEAKER_S1] Can I... There's a question from Richard Broome, one of our colleagues: can AI be used to detect fakes?

[SPEAKER_S2] Yes, indeed. I think the fake detection thing is indeed something that... well, there's going to be a ton of money spent on this. In fact, I have a friend who launched a company that focuses on forensic analysis of video and basically proving authenticity of video. Whether AI can detect the fake or not, that's going to be the question. I think we're going to be looking at a sort of arms race approach, where, you know, the one technology is going to get good and the other technology is going to lag behind, and it's going to have some catching up to do. And once it catches up, there are going to be, you know, some awkward periods of time where I think the AI will not be able to detect things. There is this thing. What's it called? It's the, um, Content Authenticity Initiative. And it is sort of this coalition where smart people are really thinking about this, because, I mean, this is dangerous stuff that we're dealing with. And my camera... I've got a camera, I think, behind me that I bought, my first really nice photography and video camera. It's a Sony, and it's what's called a full-frame camera. I decided my old man hobby is going to be photography. It comes with the built-in capability to create a sort of cryptographic signature of every photograph that I've taken. And so that signature is created at the time of creation and is basically going to be the thumbprint for that photograph, for proving that it was an authentic photograph and not generated by AI, which is really interesting. So these things are starting to happen, and I would like to see that technology make its way into audio recording and video recording. So basically the digital file will have that sort of cryptographic signature, potentially, to live with the file throughout its life, so that we will have that.
Until then, we're going to have to sort of make up other solutions. But but for now, I think that's that's got real possibilities.
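[Editor's illustration] The idea of a signature created at the time of recording, which then travels with the file, can be sketched with a keyed hash. This is a deliberate simplification and an assumption on the editor's part: camera-based content credentials rely on public-key signatures and embedded manifests, whereas the example below uses a shared-secret HMAC from the Python standard library purely to show the principle. The key, the byte strings, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign_recording(data: bytes, secret_key: bytes) -> str:
    """Compute a keyed SHA-256 fingerprint of the recording's bytes."""
    return hmac.new(secret_key, data, hashlib.sha256).hexdigest()


def verify_recording(data: bytes, secret_key: bytes, signature: str) -> bool:
    """Check that the bytes still match the fingerprint made at creation time."""
    expected = sign_recording(data, secret_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


key = b"archive-demo-key"  # hypothetical secret held by the recording device
original = b"stand-in bytes for an interview audio file"
signature = sign_recording(original, key)

tampered = original + b" edited"
print(verify_recording(original, key, signature))   # unchanged file
print(verify_recording(tampered, key, signature))   # altered file
```

Any change to the file after signing, even a single byte, produces a different fingerprint, which is what lets an archive say that a recording is the one that was originally captured.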

[SPEAKER_S1] Okay. Thank you, Richard. So there's a question from Catherine Travis: you raise interesting and complex issues, very relevant for a project currently being conducted to bring together the multiple language collections that exist across Australia, including oral histories. One of the goals of this project is to work across disciplines and create data sets and tools for use from different perspectives, and also for community members. Is that something you have been thinking about, and does AI make any difference for that, facilitating it or making it more complex?

[SPEAKER_S2] Absolutely. No, I think about this all the time. And I think, you know, getting back to the good of what we can get from this: you know, when I talk about speech recognition, people just ten years ago were saying it's not going to happen because, you know, it's going to take these models and you're going to have to teach the machines your model, and if the language, even the accent, shifts, you know, the machine is going to struggle. And the difference is that ten years ago we didn't have supercomputing and AI, you know, as a way to crunch through this data so much more efficiently and effectively. And so the ability to do that is really moving forward very quickly and very powerfully. So this idea of doing this project that is going to bring together multiple languages, or, I'm sorry, multiple language collections: you know, again, ten years ago, they were like, "Never going to work. Good luck." But the models are now teachable and iterative. So this idea of feeding a language into these machines that is actually going to potentially, you know, be able to continue and live and be dynamic as every update happens. I mean, I think speech recognition is an interesting example, because technically, each time you upload an interview into somebody's speech recognition model and you make a change or correction, you've just taught that company's model, you know, and it learns. It will learn from all the mistakes that it's being told it has made. That's why a lot of the speech recognition engines in the United States, there's one called Mint.com, they want you to edit inside their editor, because that's how they're teaching their machines where their machines are messing up. Now, we have so much data out there that's teaching the machines. I mean, who knows, like, the Amazons and the Googles, where they're drawing their data sets from.
I mean, YouTube is one of the most amazing data sets for speech recognition out there now. And so this idea of then applying that potentially to indigenous languages is going to be really exciting, and I think really interesting. I mean, not just indigenous languages, but I think we're going to see some real possibilities there. And I think this is absolutely something that AI will make a big difference in doing, and doing successfully.

[SPEAKER_S1] Yeah. Thank you. I'm just going to combine a couple of questions, about the implications for the sort of life narrative methodology, and how are you planning to continue to provide access while protecting privacy?

[SPEAKER_S2] Yeah. Um, you know, this is one of the focuses of the conversations that we were having at the British Library this summer. And I think part of the goal was to advocate for the life narrative approach. Not everybody buys into the life narrative approach. It's labor intensive and it's costly. And there is this idea where, you know, I could interview 15 people one or two times, as opposed to interviewing that one person 15 to 20 times. And I think that's something that is really, um, you know... So we were brought together to kind of talk about what the advantages of the life narrative approach are. And the life narrative approach really, I think, shines from a detail standpoint, like really nuanced detail. That's what it does really well, because it gives the space for people to kind of explore, yeah, this really high level of detail, which is really fantastic. The problem is, it has got so many data points that... um, you know, I really walked out of that symposium thinking, oh my gosh, do we need to wait till everybody's dead, you know, before we unrestrict their life narrative oral histories? Because we have some projects where those interviews are out there, and people are excited about them being out there. But it really pushed me to think, you know, do we really need to... you know, again, informed consent didn't really cover that. You know, when you just said, "Well, there's this internet thing. We're going to put these interviews online and it's going to be really cool. It's going to be searchable in this really cool system that's going to link people from the moment where they read something to the moment where you, you know, hear it and read it," and all of that. And it just really... um, I think, really, yeah, I'm torn about that approach now and its availability.
And I'm starting to think about ways, in my own collection strategies, that we can actually limit access. You know, in the first part of my career, I joked that people would kind of roll out Digital Doug to talk about all the magical things you could do, you know, and then OHMS came along and it got even more so. But now I'm coming out and saying there are all these things potentially wrong with this approach, with what we're doing. And so I'm having a crisis, I think, in terms of that. And we are starting to talk about implementing a click-through before somebody can access our interviews. So each time you say, "I want to play an interview," having a statement come up that will, you know, hopefully try to prevent bots from harvesting it automatically, and, you know, just basically tell people what I feel like the archive should be telling people from an ethical access standpoint. So we're going to implement that in 2024, which ten years ago would have been like, "That's a barrier to access. You know, we can't do that." And I've definitely flipped on this a little bit. I'm still an advocate for enhancing access to the interviews, but much more mindful of what interviews are appropriate to be out there in this world, I think, in this moment.

[SPEAKER_S1] Yeah. Thank you. It's a good question from Ruth Gamble: how does AI technology work across languages, and does it break down or reinforce English's hegemony? And will we be able to include smells and other senses in interviews soon?

[SPEAKER_S2] Oh, yeah, that's a different question. But let's start with the language thing. I came in with the same assumption that is implied in this question, that it's just going to be very, you know, English focused. My center is doing a partnership with West Chester University, which is a university in Pennsylvania. Their students are working in a collaborative oral history project that's happening in Germany, interviewing Ukrainians, Ukrainian students who have left Ukraine. And so these interviews are in Ukrainian. And so they sent me this email, and they're like, "Well, you know, are you guys going to be able to transcribe this?" And, you know, I was like, "Ah." And then I looked, and the system actually does it. And so we were actually able to create a Ukrainian transcript. And then I sent it back to them thinking, it's not going to be good, it's not going to be good. And they came back and were like, "Hey, this is pretty good." That was their feedback. And I was like, oh, that's amazing. So I'm looking at Whisper right now, and Whisper, sure enough, you know, will do speech recognition in Afrikaans, Albanian, Amharic, Arabic, Armenian, um, languages I can't pronounce, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian. You know, it goes on and on. Like, Haitian, you know, that's one that I've always thought about, because I have a collection of 125 interviews with Haitians who survived the 2010 earthquake. They're all in Haitian Creole, and it's one of the most expensive projects for us, because to transcribe Haitian, we're paying like quadruple what we would pay somebody to transcribe a typical English interview. And, yeah.
So I think AI is really crossing over and might be the thing that breaks apart this English hegemony and gets us, um, pretty good translations, potentially, as well. So it's not just about creating the transcript in that language, but they're also creating these translations that are pretty good. And we've all seen... I don't know how much anybody else has messed with, like, Google Translate and things like that. In the beginning it was like, oh, you can't trust it. Now you can use these apps on your iPhone to have a pretty accurate conversation with somebody in real time. If you have a good international data plan, you can actually have a conversation with somebody in a language you don't speak. And that's pretty fantastic. So I absolutely think that. The smell, taste, you know, thing, I think, is going to be interesting. I think there's going to be a lot of focus on visual for a long time. I think we're going to get into, like, 360-degree video. I think that's coming. It's already here, but I think it's going to be coming at the interview level. I think 3D is coming. You know, I mean, one of the classic AI models that we have here in the United States is the Shoah Foundation, which got the big grant from whoever, I can't remember. But they used the Gollum suit, you know, from Lord of the Rings, where there's a human actor playing Gollum, and he wore this light suit, and they sort of took that and were able to animate from this actor, you know, who was really sort of acting all of that. They put that suit on a Holocaust survivor and interviewed the Holocaust survivor. And they would create different perspectives. My understanding is they'd have, like, a child interview that Holocaust survivor, they'd have a man interview the Holocaust survivor, they'd have a woman interview the Holocaust survivor. And they do this data set of all of these things.
And then they have an enormous data set of 52,000 interviews with Holocaust survivors. And then they create a hologram of that Holocaust survivor. And I've done it. I've walked up to this hologram, and, you know, the director of the Shoah Foundation was like, "Ask him anything." And I just said, "Were you afraid?" And this holographic Holocaust survivor basically explained how he was afraid and why he was afraid, and all the ways that that fear manifested. And it was astonishing how real it felt. So I had lunch with the director of the Shoah Foundation after this. You know, we're having lunch, and I was just like, "You know, it's really impressive technology. It's really great. But you've got to admit, this is a little creepy." And he was like, "That's what I used to think, you know? But the reality is, there's nothing more powerful than a Holocaust survivor coming into a classroom and speaking to students." You know, that's just one of the most moving things. And he said, "And we're getting to a point where we're mathematically not going to be able to do that anymore." And so this idea of creating the holographic experience, I think we're going to start seeing more of that. I think a lot of emphasis is going to go into this sort of immersive visual environment, obviously with, like, you know, wearing the big goggles and having this sort of immersive experience. I think we're going to see that bleed over into oral history, probably before we get into things like taste and smell. But I think that taste and smell thing is probably going to happen. But we're going to see a whole lot of activity happening on the visual front before we get to that point, I think.

[SPEAKER_S1] Okay. Thank you. There are several questions about consent. Can you anticipate risks, or can consent forms be changed in this new environment?

[SPEAKER_S2] I mean, my archive... you know, a lot of this conversation happened where interviews were conducted in the 1970s, where consent forms didn't envision the internet. You know, the archival community in the US really grappled, and grapples, with this idea of what do we do with those legacy interviews, where the person really signed the paper but didn't really, truly understand that this thing was going to globally connect us and put this interview at the fingertips of everyone. Most people's perspective of what it meant to archive something really, truly meant: it's going to go in a box. It's going to go on a shelf. Some researcher who's working on a dissertation is going to access this and write a dissertation, and then that book is going to get published by an academic press and be read by 13 people, maybe picked up by a class, so maybe we'll push that up to 45 people. But, you know, the limited audience just made the privacy implications, and the deeply personal implications, of what's in that oral history interview very minor, because, you know, it was about scale and risk. But now I think it's different. So the Nunn Center, you know, kind of pushed through that period of time with the sense of, well, we're just going to be very vague. You're turning your interview over to the archive, and, you know, the rights are transferred, but we're not really getting into that. Everybody's kind of putting a future technology clause into their archival deed of gift that says, you know, delivered by all future technologies. There's a legal way of saying it. I can't remember what it is, but it's basically all future technologies that we can't even envision yet. So that legally covers us.
But that's not the same as ethically. And so, from an informed consent standpoint, right now, with our project partners, I'm bending over backwards to say, look, you need to really acknowledge to individuals that their interview is going to come into the archive, but it's meant to be heard and it's meant to be listened to. And so we're big about providing future interviewees with links to previous interviews that show the system that they're currently going to be delivered in. Something's going to come along and replace my OHMS system, I promise. I know that's coming, and that's okay. But it really helps, I think, to show people, at least now, what a sort of enhanced access to their interview looks like. But there's no way we can really envision all the ways somebody can reuse that interview after they've downloaded it and played around with it. Now, we don't generally allow downloads of our interviews, but people can pirate these interviews. I pirated the Tom Cruise deepfake video off of TikTok so that I could put it into my presentation. People can do these things. And so I think we've got a lot of work to do on informed consent, to really explain. I think the main thing is to overemphasize this idea of what is happening when you give an interview to an archive in general, and that is that somebody is going to listen to this, and that somebody could be in a different country, working on something entirely different than the oral history project that this interviewer is currently working on. And I think also this idea of providing the option where they can embargo or add some temporary access restrictions. I think we're going to start to see more and more of our interviews get restricted during people's lifetimes, and I think that's probably a good thing. Not every interview should be online today, in this moment.
And that's definitely a big shift career-wise for me in terms of messaging, where I'm moving into this different zone where I'm really thinking about the intersection of ethics and technology a great deal.

[SPEAKER_S1] Okay. I think we just have time for one more question, because we're pretty much out of time. One of the questions was: do you think that AI will make people more unwilling to have oral histories recorded?

[SPEAKER_S2] Yes, yes, there's no doubt in my mind. You know, when Facebook and social media first came about, people were putting things on there that you just knew, if you were an adult at that time and you looked at that, you're like, you are going to regret putting that out there in ten years. Why did you do that? And I think what social media has turned into, for a lot of people, is a very highly curated version of themselves. And so people got a little bit more discerning about what they're going to put out there. And I think that's going to bleed over into this, because as people discover and understand more fully the possibilities of the technological landscape that we're going to be living in, I think we're going to get a lot of reluctance in terms of people willing to share. And if they're reluctant, I want them to be reluctant, because, again, from an ethical standpoint, I get calls from people who, ten years later, say, I knew it was going online. I just didn't understand it was going to be number two on a Google search of my name. And I have this deeply personal moment in this interview. Please take it down. And I think that's very real. And right now I think we're in a phase. But as we proceed through this phase, I think we're going to start to see people curate their life stories a little bit more, in a more controlled way, just as we sort of curate our social media profiles, some of us. And I think that's going to come with some great hesitation to participate in oral history, and we're going to have to come up with ways that assure our narrators or interviewees that we can properly take care of their interview and curate it effectively and safely.
But the reality is, again, anything digital seems to be something that somebody can pirate and somebody can misuse. And I just don't want it to happen on my watch, using my material in our oral history collection. And so we're going to continue to explore ways that we can protect our interviewees.

[SPEAKER_S1] Yeah. Thank you. But, I mean, I think there's a lot of positive in what you talked about. I think the increase in access is a really good thing. You gave the data about how it increased to 238,000 accesses, whereas, as you say, the old model of it being a transcript or a recording in an archive was actually very limited in the number of people that would use it. So I think my.

[SPEAKER_S2] My favorite calls to get, you know, I get takedown requests, but my favorite calls are when somebody calls and says, I was able to discover your interview with my great grandfather, and I'd never heard his voice. Thank you so much for preserving this story. And I've got these great stories in our archive of global discoveries happening that are just making the world a much smaller place, and I think really playing into this idea of why we all got into this business in the first place: connecting people, connecting stories, and creating new meaning. And I think that's happening on a very large scale already, and I think AI is going to intensify that. So we're going to intensify the good that's happening from this. And I think in this world, especially here in the US, where we're just so divided politically and so divided culturally, this idea of people actually listening is so important. And whatever we can do to take this oral history thing that we are doing and get people to listen and connect to the material and connect to each other, I think, is just so important right now.

[SPEAKER_S1] Yeah. Thank you. I think that's a really good thing to hold on to and to have as a sort of take-home message. Thank you so much, Doug, for giving us this great lecture. And we hope you can now get some rest before you go to bed. And thank you to everyone for attending. The recording will be available for those of you that want to listen to it again, or for those of you that missed it. Judy and I will work on that now. Thanks to everybody, and hope you all have a great day. And thanks for attending. Thank you, thank you.

How do we make people more aware of their personal data?

Doc: We have two selves in the world at any given time now. We have the physical self, our flesh and blood, our voice, our presence in the world which extends beyond our bodies but lives in this physical space. There's this other space, which we started calling cyberspace a long time ago, but it's a real thing. It's a data space.

Julian: We are, as people, quite often data illiterate. We don't realise the impact that data has on our lives, we don't realise what we're giving away and we don't realise the mechanisms that will enable us to re-empower ourselves in this environment.

Adrian: Security and privacy, they are issues – people do care about them and we need to, we should address them.

Alexandra: I think unfortunately at the minute we make people aware of their personal data when terrible things happen.

Aleks: It's going to take personal experiences of falling over, some kind of truly horrific experience before people actually feel that they are compelled to be educated about this.

Alexandra: And I think there are two pieces of data people feel very strongly about: one is health care and the other one is banking. So if you touch those two areas, the reaction is extremely strong because they feel that it sort of touches something that they should be in complete control of.

Doug: The Internet is manmade and everything you put online is recoverable. There are world-class security experts, but there are still people who are stealing money from bank accounts.

Adrian: As technologists we have a duty to try to explain these things to people and to try to get across, you know, find ways to make it real and make it make sense, and we need more examples and better education.

Julian: We need to create a more informed debate about data, especially the value of data. The value of data as individuals and the value of data aggregated.

What are the disadvantages of managing your own data?

Doc: Right now there's more talk than ever about owning your own data, because there are so many companies out there that are gathering data about us, that we don't own at all, that we don't control at all.

Glyn: So if you're trying to manage your own data, one of the problems with this is you don't necessarily know what data you're giving to other people. So you're lacking information and lacking any way of finding out that information.

Jon: So I actually want it to be in the hands of agencies that are going to do good with that, that can maybe use my data in comparison to millions of other people's data to find trends, or to find specifics about me. So yeah, we need it to be out there in order to make it work, and that's where the tension comes in, because as soon as I agree for it to be out there, and allow it to be worked on, it's a bit like having a house party when I was seventeen and saying, everybody's welcome. Well, they were until it got a bit out of hand, and you know where that story goes.

Jeni: I think that there are very few people who are willing to sacrifice as much as it would actually take me to not be monitored and surveilled at all.

What kind of help is available for people to manage their own data?

Jon: We need to move the debate beyond a discussion about what it can be in a commercial realm into activity in the ordinary domestic role.

Alexandra: I think we should be educating people about data through building partnerships with the companies who are involved in selling those products.

Doug: I think we already have a middleman for open data in an institution that's been recently created, the Open Data Institute, co-founded by Tim Berners-Lee and Gavin Starks, and there's some fantastic people that created an accreditation system.

Doc: What's going to happen in the long run is that people will have control over their personal data because they'll know better what to do with it and the tools will exist for them to do more with it, with that data, than these other companies could. It's just like it was with personal computing.

Jon: And when you kind of mobilize a world task force of paper geeks you like making stuff.

Doug: I don't think we need technology businesses to sit there and vouch for people's data. That seems to me an old way of doing things. I think an open source approach with five-star accreditation from the ODI seems a good way forward.

Doc: The trap not to fall into is the trap of fear right now, and we're at a high point of fear thanks to Edward Snowden, thanks to discovering what the NSA and the US has been doing, and what GCHQ here has been doing.

Jason: If you can build a sneaky feature into a device, and you decide not to tell the user, you decide to kind of break the law a bit, certainly in data protection terms, there's a real risk that it will get found out.

Doc: This is a power we can use for good or evil and probably both, but we, it's present in the world now and we have to figure out how to use it.