Australian Web Archive Webinar

Thank you Ruby. My name is Andrew and I’m a Trove outreach officer at the National Library of Australia. So today we will be
having a look at the Australian Web Archive, which is good timing as today
also marks three months from when the Australian Web Archive was
officially launched. In a world first, the Australian Web
Archive is a fully searchable online collection, available through Trove, which
brings together the past 23 years of Australian culture: its people, places and
events, as represented on the internet. Just as Australian print publications
such as books, magazines and newspapers have changed over the years, so too has
the internet changed. And the Australian web archive is an important national
collection that has documented these changes through Australia’s history. So
to give you a taste of the ways that the Australian Internet has changed over the
past 23 years, I thought I would start by having a look back through this period,
with a selection of websites that can be found on the Australian web archive. And
in order to do that I will switch my video off. So if we go back
to 1996, one of the earliest websites on the archive is from the Commonwealth
Bank of Australia. It was a simpler time with some interesting color schemes, but
you can also get some insight into the kinds of services that they first
offered online, with net banking not so far away. It was also around this time
that the National Library of Australia launched its first website. You
can see here that it looks a lot different than it does today; it’s mostly
text-based. And we had a different Prime Minister, and here’s the first archived
website for an Australian Prime Minister, with some very stylish wood paneling
decoration, a calendar of his appearances, transcripts of his speeches and interviews,
and even a kids’ page with a multiple-choice quiz about the
Australian Government which you can still use to test your knowledge
now. But even then organizations were starting to use the internet for
lobbying purposes. Here we have this site promoting the “Yes” vote at the 1999
referendum, in regard to the Australian republic debate. The year 2000 was a big
year for Sydney: along with the Olympic Games, they also launched the Airport
rail link. On this site you could get information about tickets back when it
would have only cost $15 to travel return from Central Station to Sydney
International Airport. In 2001 we had the centenary of Federation with a wide range of events
publicized through their website, and the internet also became a
valuable source of sports information, with sporting news, games coverage and
information about the various teams. You could even download icons and
ringtones for your favourite team to put onto your phone. Farther afield, the web
was also used by the Australian Government to deliver information on
operations in the wider Asia-Pacific region, such as the mission to the
Solomon Islands in 2003. 2004 saw the completion of the Ghan railway to Darwin, and people could buy their own tickets
online from the website. And websites
became a way for cultural festivals to provide detailed information on their
events, such as the Sydney Writers’ Festival, providing details on their
sessions and guest speakers. And of course you could keep up with your
favourite television celebrities and follow Bert Newton’s transition
from Good Morning Australia to Bert’s Family Feud. You could find out which
bands were playing at your favourite music events and at what times. As you can see, websites were becoming
somewhat more colorful and interactive. After the change of government in 2007, you can always look back at the ways politicians
used the web for communicating, especially in the late 2000s, which saw big changes in the web with social media emerging.
You could also download the official photograph of the Prime Minister, if you
so desired. 2009 was a big year for the National Library. Trove was launched,
bringing together a range of online collections through one single portal. We can look back at prize winners in artistic awards, such as the Archibald
Prize. Or look at how online campaigns
focused on the prevalent issues of the
time, such as the price on carbon in 2011. 2012 saw the introduction of the
Queensland Literary Awards, a community-driven awards program. And in 2013 Canberra celebrated its own centenary with a vibrant calendar of
events and programs. And so we can see how Australia
celebrated the influential people of its time, as illustrated here with the announcements of the winners
and finalists of the Australian of the Year in 2014. And we can also see how pivotal events
were reported in the news media online. Also of interest was
how government websites handled opportunities for innovation, such as the
online census in 2016. And finally we can reminisce on more recent events which
may already feel like a distant memory. Whatever your personal interests, whether
you’re researching people, places and events in recent Australian history, or
just wanting to go down a virtual rabbit hole of nostalgia, the Australian web
archive has something for everybody. Of course the Australian web archive has
been a long time in the making. In 1996, the internet was relatively new
and had limited usage in society. According to the Australian Bureau
of Statistics, only four percent of households had a dial-up modem, which is
how they would have connected to the internet over the phone line. For
everybody else there were public and state libraries,
where you would need to book ahead to get internet access. And we also had
internet cafes. Who remembers them? It was also in this year that the National
Library of Australia established PANDORA, one of the world’s first web archives,
alongside the Internet Archive in San Francisco and the Bibliotheca
Alexandrina in Egypt. You can see here a timeline of web archiving around the
world, which I should point out I was only able to find by using a web archive.
But it’s interesting to see the gradual uptake of other countries in setting up
their own web archives over the following 15 years. So PANDORA, the National
Library’s first web archive, was an acronym of Preserving and Accessing
Networked Documentary Resources Of Australia. This was a curated archive of
selected Australian online publications, such as electronic journals,
organizational websites and government publications, all selected for their
cultural significance, and then indexed for browsing by title or subject area. In
its first year from October 1996, around 30 or so websites were downloaded. 22
years later, this has grown to over 57,000 websites. And by 2003 it had
become a searchable site, which was eventually incorporated into the Trove
search platform. In 2014, the Australian Government Web Archive was also launched,
as a service distinct from the PANDORA archive. It included Commonwealth
government websites dating back to 2005, and this was a particularly useful tool
given the changing nature of government departments, which meant that
websites and links were also often changing. Without a web archive much of
this content might have been lost, and in 2017 this collection of Commonwealth
Government websites was amalgamated into the Trove web archive. In addition,
starting in 2005, the National Library began to conduct annual large-scale
harvests of Australian domain content. That is to say, websites ending with
.au. Now when it comes to the National Library’s physical collection, much of
the collecting work is guided by legal deposit legislation, which is a
requirement for all Australian publications to be deposited with the
library’s collections. In 2016 changes to the legislation came into
effect, which expanded this law to include electronic publications and with
them Australian online content. With this in place the stage was set for the
National Library to provide public access to all of this content on the
Australian Web Archive. As I mentioned, the Australian Web Archive was launched
three months ago on the 5th of March. This was a project that aimed to bring
together all of these elements which I’ve just mentioned: PANDORA, the
Australian Government Web Archive and archived Australian domain websites. And
we wanted to make these all discoverable in the one place. Not only did it need to
be collected but it also needed to be indexed, so that it could all be
searchable. And to give you a breakdown of how much content we’re talking about,
we’ve made a handy pie chart here. As you can see the recent addition of the
Australian domain websites adds significantly to the material that was
previously available. This is one of the world’s largest fully indexed, openly
searchable archives of web content. There are about six hundred terabytes of data
across nine billion records in the Australian Web Archive, which is a
staggering amount of information. And to put that in context, if you printed a
600 terabyte Word document, it would stretch from Canberra to Cairns. The
Australian Web Archive can be accessed from any device with an internet connection.
You don’t need to visit the library to use it and it’s absolutely free to use.
It’s also important to understand what the web archive isn’t. Firstly, it’s an
archive of Australian websites that are publicly available and free of charge, so
anything that wasn’t made public, like for example your private email or content
kept behind a subscription paywall, should not be findable on the archive.
Secondly, it only consists of websites that were selected for their
significance to Australia, or those websites that were harvested from the .au
domain. Websites whose addresses do not end with .au, for example, will not
necessarily have been captured in the web archive. The web archive cannot be
used to view Flash content on websites, and there’s very limited access to audio
and video material. Finally there is also very limited social media content, with
only some Twitter feeds and YouTube videos that have been chosen for
preservation by professional web archivists. So this sounds good in theory
but I’m sure you’re keen to see how it all works, so let’s go into it. Firstly, to
search the Australian Web Archive, you need to go to Trove. The address for
Trove is “”. However if you’re on the library’s website “”, you can scroll down and click on the link to Trove. From here you can
see we have an array of different options, but for now we want to select
“Archived websites 1996 to now”. And here we have arrived at the main search page
for the Australian Web Archive. From here if we enter the URL or address of the
website, for example if we went “”, the National
Library’s website, it takes us through to what is the
earliest captured snapshot of the website. In this case, the 19th of October
1996 at 4:42 p.m., which incidentally was a Saturday afternoon.
However this is one of a series of many, many snapshots, each a saved copy of the
website as it existed on that day and time. You can also see here that there is
this box with some information in it. This box is called the snapshot remote
and this is your tool for navigating through the timeline of snapshots for
any website that you view on the web archive. You can maximize or minimize it
using this little button in the top corner. So we’ve made it disappear and
we’ve made it reappear. You can also move the snapshot remote around, by clicking
and dragging the box around the page. So for example if it’s blocking some
information you want to read, you can just move it out of the way. On the left
sidebar over here, we have a number of tabs which provide information about the
archived webpage that you are looking at. First we have the “details” tab. This
provides details about the snapshot of the webpage that you are viewing
including the title, the date and time archived, the URL of the original website,
and links to related information such as Trove catalog records. Next down we have
the “collections” tab. If the website snapshot appears as part of a collection
of sites, it will be listed here. Thirdly the “cite” button provides handy
ready-made citations for the site that you are viewing, in a number of standard
forms. And so you can cut and paste the citation and use it to reference the
website. Coming back to the snapshot remote, on the left side here we can see
the date and the time of the snapshot. And then on the right we’ve got some of
our navigational tools. The top arrows here allow you to move backwards and
forwards through the timeline one snapshot at a time.
So using these arrows we can effectively move forwards and backwards in time,
through each snapshot, as I’m demonstrating here… So these are just going forwards
and backwards, one snapshot at a time. Of course it would be very slow to view
them all one snapshot at a time, so we have a few other tools. Underneath here,
these two arrows take you firstly to the earliest snapshot of the page, in
1996, and then the other, to the right, can take you to
the latest snapshot of this website. And in this case this was the 10th of
May 2019. Another way of navigating the web archive is to use the timeline, which
you can view either by clicking on this little clock here and it pops up at the
bottom of the page. Or we can also see this little arrow here where you can
make that pop up and you’re also viewing the timeline here. From here, you can see
the range of dates where
snapshots were taken for the website. And from here we can see that it’s been
saved from 1996 through to 2019. These bar graphs show you how many times a
snapshot was taken of the website in that year. So we can see that many
snapshots were taken in 2014 and 2010. In other years they have been a little more
intermittent. And then at the very bottom here we have a list of dates for each
snapshot. You can scroll through those dates using the scroll bar. And if you
pick a particular date, say the 13th of March 2010, by hovering over that date you see a smaller image that shows what the page looked like, or you can
click on the date, and it will load the snapshot from that date. Our final navigational tool that I
mentioned is the calendar tool here. Again back in the snapshot remote,
there’s a little picture of a calendar. If you select that, here now we have the
same timeline and view years along the top here, but for each year it shows a
calendar in which you can see which dates the website was saved. So on
the calendar you can hover over a particular date and you’ll again see a
little picture that shows what the site looked like. And if you click on it, it
will load that snapshot. So there we have that. Now at any time we have these tools
along the top of the page and we can always return back to the main search
page for the Australian Web Archive by clicking on “Archived webpages” here. So
before we go any further I thought this would be a good opportunity to stop for
any questions so far.
RUBY: Yes, thank you Andrew, we do have one quick question
that’s come in from Christopher and I’m not sure if we can necessarily answer
this one, as it is a little bit of a technical question, but Christopher’s
asked, “The bar graphs are not showing for me, comes up as a broken image”, so I
believe that was on the timeline section. So have you come across this before, Andrew?
ANDREW: I have not come across this so far. I would suggest you get in touch
with us independently and we can have a look into it further.
RUBY: Yeah definitely, and Chris if you could provide even just a little bit more
information today, maybe we can look at this towards the end of the session as
well. But hopefully no one else is experiencing any oddities in the
website. But that’s all, thanks.
ANDREW: Okay. So as I mentioned before, the Australian Web Archive is fully text searchable. This means that whilst it’s really
useful if you know the address of the website you want to look at, you can also
search for keywords. For example, if you wanted to search for, say, Margaret and
David’s film reviews from the ABC show At the Movies, you can search for the phrase “at the movies”. I’m going to put it in quotation marks to make it a phrase search. And click search… You’ll see there’s a big list of results. In fact we have
1.159 million snapshots, because the phrase “at the movies” is
quite a common phrase. So when we look at our results, you can also see that it shows the URL or
web address for each of these results. Since At the Movies was an
ABC site, we can see that it’s probably one of these under So I’m
just gonna click on that, and it’s taking us back to June 2014. And there they are.
Margaret and David looking glamorous on their couches, and I might just go to the
homepage for At the Movies… Yeah. So now we’re looking at the snapshot for At the Movies, on the ABC in June 2014. If we want to look at the timeline by clicking
up at the bottom, we can see that the majority of captures for this
website were taken from 2004 to 2014, which incidentally was the period of
time that the show ran for on the ABC. So if you’re looking for a film review you
can browse through these years and, say, pick a random date, if you’re
just curious as to what they were reviewing on
the 18th of December 2010. Sometimes we’ll get a redirect message. And there
we have it, as seen on this date, which is 28th of December 2010. They were reviewing
a number of films. They were reviewing The King’s Speech, Blue Valentine,
Morning Glory, Unstoppable, etc. You can see there their star ratings. But if you
wanted to click on one of them, say The King’s Speech, it’ll give you the full
review as delivered on the website at the time as well as a transcript of the
conversation that they had when they were reviewing that film on the TV show.
Of course if you just wanted to search through a list of all of the films that
they reviewed, just go back to the home page… We can go up to
December 2014, which was when they had their big finale. But if you go to the
movie reviews page, you can see that they’ve kept an archive of all of their
film reviews, which you can view by year, or alphabetically by title. Or you know,
if you’re curious about which films got the best ratings or which films you
might want to avoid. Unless you like watching terrible movies. It’s all very
subjective, of course. Now when searching, as I mentioned before, you have options
for limiting your results. When you’re viewing the website, along the top here
we still have our Trove search bar. And this little button here, which is really
useful, is a way of limiting the domain of the site that you are searching.
See here, because we’re already on the site, it gives you the
option to limit your results to that domain. And so if I wanted to search
specifically for a review of “No Country for Old Men”, to pick a
random title, I could search “at the movies”, just to help limit the results to
this site, and “No Country for Old Men”… again I’m using the quotation marks to create a phrase
search rather than a single keyword search. And here we have, again
we’re back in our results view, but we can see that there are
238 snapshots of pages that
match the phrases “at the movies” and “no country for old men”. And right at the top
here, we have a review from 17th of September 2014 which is my birthday.
And yeah, we have their review for No Country for Old Men,
which got five stars from both Margaret and David. If we go back to our results,
we can also see a few other interesting features which you might be interested
in. For example you have a review here where David spoke to the Coen brothers,
who directed the film No Country for Old Men, and so this is really
useful for seeing and discovering other content on the web archive which might
be related to what you’re looking for. Other interesting things like, you know,
links to viewer poll results, for example, where people could vote for
their favorite films in 2007. Or maybe some of the farewell
messages where people discussed their favorite films on the web forum through
their message board. Now if we go back to the web archive homepage here, one thing that you need to bear in mind is quite often when you’re
creating a new search you need to clear everything that you searched before. But
you can further control your results by using the advanced search tool here. You
can specify keywords and phrases and limit your results by the domain, i.e.
the last part of the web address, by the date, or by the file format type. Just to
go over some of these functions: for the snapshot date, the format that we use is
that we start with the year, and then the two digits that correspond with the month,
and then the day. And so we have the beginning date here and we have the end
date here. And you can also limit your results by file formats. So if you’re
interested in finding PDFs or PowerPoint presentations then this is a really good
tool for limiting and filtering those results. So for example if I wanted to
look at past federal budgets in Australia, I could search for the phrase
“federal budget” and limit the date to a specific year, let’s just say 2008. So
because it’s just the year 2008, I’m putting 2008 in both the beginning and
end dates, and it actually fills out automatically from the first of the
first, so the first of January, through to the 31st of December, so
that covers the whole year of results. And if we look at our results, now it’s
quite a lot of stuff there: 132,721 snapshots.
However we can limit the results, say, to a specific domain. So we
limit it to an Australian government web domain by clicking on this button here. Now we’re down to 40,000 snapshots. Or alternatively if we wanted to look at
other perspectives, we can change the domain that we’re looking at. So
we’re not going to look at government websites but instead we could search for,
say, news media responses. Here we’ve got an article about the federal budget in
2008 in the Sydney Morning Herald. And so you can see some of the ways the news
media reported on these events. Or alternatively we could change that
domain to and maybe look at how the federal budget of that year
affected, say, non-government organizations. And so we’ve got results
from various non-government organizations, I guess, having a look at the federal
budget and responding accordingly. To use another example, if we were to search the
web archive for content on climate change, and I’ll go back to the main search here, I’ll just do a phrase search for “climate change”, without any filters. We get over 55
million snapshot results, which is far too many to read through. We can limit
that to web domain. And that brings it down to just under 22 and a half million. Or we can go to our advanced search. Now, as I
mentioned, we have to reset some of these entries and so I’m clearing what we were
searching before. But we limit that to for the phrase “climate change”.
Now we’re looking at about 2.7 million results. Again we can filter down a bit
further and say we just want educational websites about climate change that were
published from 1996 to 1999, right back to the very beginning of the
web archive. We click search. Now we’re down to a little over eight and a half
thousand snapshots. And finally, if we were to limit
by file type, say we just wanted educational resources from this period
that were saved as PDF publications. And that takes us through to various
PDF files in relation to climate change. Now this has been a relatively random
selection of results but it shows how we can really narrow down our results based
on a number of different variables. Considering that with a web archive time
is a huge factor, as in the archive picks up different snapshots from different
times, I highly recommend making the most of being able to narrow your results by
date. I know what you’re all thinking: why can’t we just Google this? Wouldn’t that
be so much easier? So I wanted to demonstrate what happens sometimes when
we try Google for these kinds of things. And to use our At the Movies
example I will search for “at the movies ABC” to help the search string along. Oh look, here we have the At the Movies website in our
results. That’s all looking pretty promising and
if I click on visit website. Ok, ok so it looks like the website has been taken
down or at least the address may have been changed. But for whatever reason we
can no longer find this content. And when you’re searching the web this
can happen quite often. It can be quite an annoying stumbling block,
what we refer to as a dead link, and it can be quite frustrating for a
researcher. It’s much like going to a library shelf only to find that the book
that you’re looking for is no longer there. However if you have the specific URL or the website address, you can
copy it from the address bar and paste it into our web archive. And there it is. So
now when you are searching for something online and you get a dead link, the web
archive can be extremely useful in finding that page back when it was still
available online. Now the web archive, the Australian web archive, is also a
fantastic tool for Australian biographical research, especially if
you’re doing research into a public figure that has had some presence online
in the past 23 years. For example, I recently saw an interview with
Australian composer, actor and comedian Tim Minchin where he talked a little
about his lesser-known work that he created prior to his rise to success in
2005. So we can use the Australian Web Archive to look at the various things he
did back when he was relatively unknown. So if we go to the advanced search,
I am going to search for the phrase “Tim Minchin”. Make sure we spell it right, clear
the domain. Now I’m going to leave the “from” snapshot date blank, so
it’s anything up to the end of 2004. I click search… We have 494 snapshots, which in the scheme
of things is relatively small. But it also creates a fascinating retrospective
on Tim Minchin’s relatively unknown work, creating music, being in theatrical
productions and working in arts festivals. I had no idea, for example, that in
2003 he was featured on Triple J’s Unearthed, where he wrote a song where he
names his favorite artists as being the Beatles, Ben Folds and Hi-5. And if
you look through, you can click to different websites which have reviews,
promotional material, even musical yeah, theatres and radio stations… To use another example, if we wanted to look back at the Commonwealth Games in
2006, search for the phrase “Commonwealth Games”. This time we will limit the
snapshot dates to 2006 and 2006. Search… Now I can see one up at the top there is
the official Melbourne 2006 Commonwealth Games website. What’s really interesting
is that you can also see how other organizations either contributed to the
Commonwealth Games or responded to the Commonwealth Games. And you can see a
whole array of different perspectives as they were promoted and reported on
through the World Wide Web.
For example, Australia Post created
commemorative stamps that we can see here. To give one more example, in early
2008, there was the apology to the Stolen Generations, to use a more
political topic. So if we’re looking at different perspectives and responses to
the Stolen Generations apology speech in February 2008, we can limit our
results to 2008 02, which is February 2008. Again let’s check that everything else
is cleared. We can search and here we have quite a lot of
information, considering this is limited down to one month of the
Australian Internet, but we can see that it was a very hot topic that many
different organizations and government agencies and councils and commentators
responded to. And so we can look back there and view the different
perspectives of its time. Now the Australian Web Archive is a fantastic
way of investigating Australian life in a specific time and place. So one
fantastic site I recently found was the University of Technology Sydney’s
virtual open day. So I’m just going to go straight to the address because I know
it. We could also search for this using keywords, but I’ll save us
some time. And here we have it, the virtual open day for the University of
Technology Sydney as it was presented back in 1998. And it won an award that
year for the best Australian tertiary education website, so this really
showcases the kind of technology people were dealing with at the time. But it also
showcases perspectives of different students describing in their own words
their life living in Sydney as a university student and their experiences
of studying at the University of Technology Sydney. Whilst it’s mostly
image and text based, the content is really rich in information about their
lives at the time. Another useful website that I recently discovered is Telstra
Springboard, which provides a unique insight into Australian culture in the
late nineties, providing links to the prominent websites of the time… So go back to 1997 again. Here, what we have here is basically a categorized list of
selected Australian websites that were seen as being quite, I guess prominent,
at the time. One thing I quite enjoy is “This is Australia”. We can see a map of
Australia, it shows all its major cities and bear
in mind this was years before Google Maps came about, so it may seem quite
primitive to us but it’s, it would have been really interesting, especially for
people living and accessing this website from overseas. We have a
selection of different websites that are related
to Australian culture. So if we click here on
libraries in Australia, we can see what the various Australian libraries
looked like back at the time. So I can click through to a site. Oh okay, so yeah we’re
already hitting a dead link. This happens quite often when you’re just researching on
the web archive, you sort of go down a strange rabbit hole of redirects and
sometimes working links. But the beauty of using a web browser is
you can always use this back button to retrace your steps. And yeah. You can see
different library websites, we can see galleries. So we have Australian
galleries. Again bearing in mind that this is the Australian Web Archive, it’s
only going to include Australian content. So unfortunately, if you want to click on
the Louvre you might get lucky but I suspect, yeah, so we get a message saying
“web page snapshot not found”. And this is because it simply wasn’t saved
by the Australian Web Archive, because it’s a French website. Let me go back to
“This is Australia”. Click “About Australia”, and it gives you the information about
government at the time. I’m always curious about what’s going on in
the arts, so we can see various different arts websites as well, so it was, yeah, ok.
Happens from time to time. Theatre Royal Tasmania. So it really can
be a process of trial and error and it’s really not perfect.
As you might have noticed from time to time you’ll see what we call broken
links or broken images, where the image doesn’t load. So it’s not perfect. There
are limitations to what the australian web archive can do and it’s not uncommon
to see that websites are often incomplete, missing or not functioning
properly. Which can happen for various reasons. Ultimately the web archive is a
collection of individual snapshots or websites but those websites are often
not designed to operate as a single static page and so when it tries to draw
from other content, such as streaming media or images hosted elsewhere or try
to connect to an external database or redirect to sites that aren’t archived,
then you’re going to encounter some difficulties. Main thing is not to panic.
You haven’t broken the internet and you can either try to you know glean what
information you can from the text that is available,
or perhaps you use the snapshot remote to see if you can have better luck. So [typing]… For example if you are looking at Sydney
University, you can see that our snapshot has lost a lot of that kind of
information, that visual information. But if we click to a previous snapshot, nope… Yeah, this is a classic example. We
can click to other areas and other times where that information
was more readily available on the archive. So finally, it’s a huge archive, I
feel like I’ve hardly even scratched the surface. But the best way to get used to
the web archive is to just go in there and explore. As I’ve demonstrated,
sometimes you can get really lucky and have quite a seamless experience with
the web archive. Sometimes it’s about trial and error and trying again and
going back to the main web page and searching with a few different
keywords, a few different time frames. It’s much like any kind of
research process really, it just looks different. But if you go back to the main
Trove website, which is where I am, I recommend just exploring to start with.
You can see here that we have some curated collections by subject and we
have some featured lists of really interesting websites. So I suggest
picking a topic and finding a website and practicing using the snapshot remote
to go forwards and backwards, and then maybe just appreciate how much things
have changed over time. One of my favorite sites, yeah for example, iconic
Australian brands with sites like Arnott’s Biscuits,
which has changed substantially through the years. Back from 1998, when they won an
award for being one of the best Australian websites of the year,
and then through the years you can see how they’ve changed. So maybe to just
2003 … and back to, and across to present-day.
Any second now. Yeah, just quite a journey. So we have time for some more questions.
RUBY: Excellent, so we do have quite a few questions being asked, so let me just… So
Tom asked, is there any harvesting of Australian websites from non-.au domains
such as .com or .org?
ANDREW: So going back to the
beginning, we draw from various sources, so the .au whole-of-domain
harvest is one that was more of an automated process, however we still have
curated content that comes through the Pandora archive. So we still have web
archivists that are seeking out non-.au content that is of significance to
Australia and they generally will contact those website owners and let
them know that we are interested in hosting them on the Australian web archive.
RUBY: Great and we had an anonymous person, sort of continuing on from
that, asking about international sites that can end in .au, so whether they’re collected as well?
For example, they gave the example of blogspot as that can be found in
numerous ways.
ANDREW: I think blogspot is an interesting one because, from my understanding, off the top of my head, it’s a Google site that uses .au as
kind of a way of mirroring their .com version, and I guess
that the main way to look for it is to search for it on
the Trove Australian web archive. I know that generally speaking if
the source website is .au it’ll be harvested, however there may be
exceptions. I couldn’t say off the top of my head to be perfectly honest.
RUBY: That’s all good and I think with something like this, a collection so large,
it’s always worth exploring if you have a particular website in mind.
ANDREW: Absolutely.
RUBY: Christopher asks, how does the archive play out with paywalls on news websites?
ANDREW: Interesting question, so we have only harvested sites that
were publicly available at the time. The other thing to bear in
mind is that up until recent years they were only harvesting a lot of sites once
a year, so it really depends on when the news report happened and whether it was
behind a paywall at the time that that site was harvested.
So again I would go search for it, if it’s there
great, if not then you might have to seek some other avenues for looking up news
articles such as eResources.
RUBY: Yeah that’s it, the library does have quite a lot of
other resources as well if you do need to find something that might not be
in the web archive you can always reach out and contact us. The next question
comes from Jill, is there a standard/threshold for preserving web
content? It was mentioned that content is preserved for its cultural value etc. I’m curious about how that is determined?
ANDREW: That’s an excellent question and I am not a curator so I couldn’t comment on
that, so I’ve been focusing on, I guess, the functionality, however if you
want to get in touch with us through our Ask a Librarian service we can certainly
direct you to some more information about that.
RUBY: Great. Tom asks, will the URLs of archived sites’ pages remain stable, e.g. for the purposes of reference
and footnotes? So I think that was when you are clicking on the item details and
referencing within the website.
ANDREW: Okay, okay. So what we’ve been recommending is… There’s two ways of referencing content
that we’re looking at on the web archive. So… yeah I’ve just picked a
random page. Generally speaking we would use the URL, this should be a permanent
link, and here, and in fact it’s actually also replicated down the bottom here. So
that’s the URL that you would use. We also have citation information along
here which references that link. So that should be a permanent link that you can
refer to in your citations.
RUBY: Great. Anna has a question, so they have used the
Wayback Machine to save websites of candidates, political parties, members and
government and there’s a Chrome extension allowing that preservation and
throws up archive pages, pages expired, how do we know if such sites have been,
or will be, preserved and does the Australian web archive have such a
capture and recall function for expired sites and how deeply does that
preservation level go? So quite a big question, but yeah.
ANDREW: Yeah look, the answer off the top of my head is that at the moment we’ve only developed this as a search platform, so… yeah I don’t believe that we have that level of functionality in terms of alerting us
to the status of websites. I guess the main difference between our search
and the Wayback Machine is, you know, they’re both at the core the same
thing but they have various different functionalities. The Wayback Machine,
as you would know, covers content from all over the world but it doesn’t
necessarily have the same level of searching by keyword. You would need to
know the URL. That doesn’t really answer your question but again if you get in
touch with us through our Ask a Librarian service, we can definitely look into that
for you.
RUBY: Excellent. So we’ve got two more questions we’ll get to before the end of
today’s session. Thank you very much for joining us and we’ve got two more to get
through and then we’ll finish up. Acacia asks a question, is there a mobile version
of the Trove website? If not, are there any plans on creating one?
ANDREW: Well the Trove
website’s designed to be viewable through a web interface and so
it’s designed so that that is the main way of viewing it, through a web browser on a mobile device. There are various reasons for that which I won’t go into but it’s
optimized for web viewing.
RUBY: Yeah so that’s definitely the best way to access it at the moment. And then Carol asks a question which I feel is a very big
question, we might not be able to answer today, but she simply asks are copyright
issues covered?
ANDREW: Okay. Now it’s a very important issue and I guess as a library
we abide by the Australian Copyright Act, which is partially what’s made the web
archive available. Through that Act we are guided
by the concept of legal deposit, which means that all published materials are
available through our services. Now, so we have a mission to provide access to
these websites through that legislation. It doesn’t mean that people can
necessarily just take and use content found on the web archive. For that kind
of permission we would still need to go back to the actual source, back to the
creator of that website. That’s a very simple response, as Ruby mentioned
copyright is a very complex and detailed issue but again if you have any concerns
about copyright with the web archive then get in touch with us and we
can have a look at your concerns. Oh and I just wanted to mention one other thing,
if you’re interested in, you know, just keeping a finger on the pulse of the
different kind of stuff that’s available through the web archive, through our
social media channels we are having a weekly feature called Web Archive Wednesdays,
where we have a look at different gems that we found through the web archive.
RUBY: Excellent, thank you so that’s all the questions today. A big thank you to
Andrew for his presentation today and answering all those questions. Very good
effort. So we’re going to finish up today. As we’ve mentioned once or twice if you
do have any more follow-up questions, we do here at the National Library have our
Ask a Librarian service so that’s really good if you have any research questions
and things like that. And Trove also has a contact us form as
well if you possibly have more technical issues around the web
archive, I’d recommend you ask them through there. But thank you all for
joining us today. Here are some of our upcoming sessions.
As I mentioned at the start of today’s session, we will be sending out an email
tomorrow with the recording of today’s webinar including the Q&As at the end,
as well as that survey if you weren’t able to get to that survey link
in the chat. But other than that that’s all from us.
Thank you so much for joining us and have a good rest of the day.
