Latest revision as of 03:13, 11 September 2020
Script : Batch import of Items
Author : Pankaj Patil
Keywords: SAFBuilder, Batch Import, Batch Import in DSpace, Item Submission, Dublin Core, Mapfile, Metadata Spreadsheet, Batch Job, Spoken Tutorial, Video Tutorial
Visual Cue | Narration |
Slide: Title | Welcome to this spoken tutorial on Batch import of Items. |
Slide: Learning Objectives | In this tutorial, we will learn to
|
Slide: System requirements | This tutorial is recorded using
However, you may use any other web browser and text editor of your choice. |
Slide: Pre-requisites | To practice this tutorial, you should have
|
Slide: Pre-requisites |
|
Slide: Pre-requisites | To follow this tutorial, you should have
|
Slide : Code files |
|
Slide: Batch Import Feature |
|
Slide: Batch Import |
|
Slide: Batch Import Methodology |
|
Slide: Batch Import using SAFBuilder |
|
Slide: Setting up SAFBuilder |
|
Slide: Simple Archive Format | Let us see the structure for Simple Archive Format.
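The archive contains one subdirectory per Item. Each subdirectory contains
* the dublin_core.xml file, i.e. the Item's metadata
* the files that come along with the Item and
* the contents file, which consists of a list of filenames.
|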
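The structure described above can be sketched by building a one-Item archive by hand. This is only an illustration, not what SAFBuilder itself runs. The names saf_demo, item_000 and sample.pdf, and the metadata values, are assumptions made for the example.

```shell
# Sketch: build a one-Item Simple Archive Format tree by hand.
# saf_demo, item_000, sample.pdf and the metadata values are hypothetical.
saf_root="$(mktemp -d)/saf_demo"
item_dir="$saf_root/item_000"        # one subdirectory per Item
mkdir -p "$item_dir"

# Placeholder for a file that comes along with the Item
printf 'placeholder content\n' > "$item_dir/sample.pdf"

# The contents file lists the filenames, one per line
printf 'sample.pdf\n' > "$item_dir/contents"

# dublin_core.xml holds the Item's metadata
cat > "$item_dir/dublin_core.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
  <dcvalue element="title" qualifier="none">A sample title</dcvalue>
  <dcvalue element="contributor" qualifier="author">Sample, Author</dcvalue>
</dublin_core>
EOF

# Show the resulting layout
find "$saf_root" -type f | sort
```

In this tutorial the same layout is generated by SAFBuilder from the metadata spreadsheet, so it does not have to be written by hand.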
Narration Only | Let us proceed to set up the SAFBuilder tool. |
Press Ctrl+Alt+T keys | Open the terminal by pressing Ctrl + Alt + T keys simultaneously on the keyboard.
Ensure that you have root permissions to run the commands. |
Only Narration | From here onwards, please press the Enter key after typing each command. |
Highlight user spoken | I will set up the SAFBuilder tool for the user spoken on my machine. |
Only Narration | The git tool is used to download SAFBuilder, and the jdk and maven tools are used to compile it.
These were already installed during the DSpace installation. |
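If you are not sure whether these tools are present, a quick check such as the following may help. It is a minimal sketch; check_tools is a hypothetical helper, and it assumes the executables are named git, java and mvn on your machine.

```shell
# Report whether each named tool is available on the PATH.
# check_tools is a hypothetical helper written for this example.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# git downloads SAFBuilder; java (jdk) and mvn (maven) compile it.
check_tools git java mvn || echo "Install the missing tools before continuing."
```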
[Terminal]: Type | Type the following command to download SAFBuilder from the DSpace-Labs repository on GitHub.
The SAFBuilder download may take some time depending on your internet speed.
|
[Terminal]: Type
cd SAFBuilder |
Now type cd SAFBuilder to change the current directory to the SAFBuilder directory. |
[Terminal]: Type
./safbuilder.sh |
Then type ./safbuilder.sh to compile SAFBuilder.
Compilation of SAFBuilder has started and it may take some time to complete. If the compilation fails, recheck your internet connection or try again after some time. |
Narration only | The SAFBuilder compilation is now successful. |
Narration only | Next, let us proceed to create a Simple Archive Format of different Items. |
[Terminal]: Type
cd $HOME |
Type cd $HOME to change the current directory to the Home directory. |
[Terminal]: Type
mkdir ItemUpload |
Now let us create a directory to store the metadata spreadsheet and all the files to be uploaded.
I will create a directory named ItemUpload. To do so, type mkdir ItemUpload. |
Narration only | I have downloaded the article files and their metadata file to my Downloads folder.
These are provided in the Code Files link. |
[Terminal]: Type
cd $HOME/Downloads |
Using the cd command, we will switch to the Downloads directory. |
[Terminal]: Type ls | Type ls to check the contents. |
Highlight Article1.pdf and Article2.pdf | For this demonstration, we will use Article1.pdf and Article2.pdf files for the batch upload. |
[Terminal]: Type
cp Article1.pdf Article2.pdf $HOME/ItemUpload |
So, copy the files Article1.pdf and Article2.pdf to the ItemUpload directory with this command. |
Narration only | Now, let us proceed to check the metadata spreadsheet file. |
Text on screen: In Windows OS, select Open with Excel | Right-click on Article1-2metadata.csv and select Open with LibreOffice Calc as shown here. |
Point to dialog box Text Import | The Text Import dialog box appears, which has settings for importing the text. |
Click the button OK | Keep the default settings and click on the OK button. |
Point to Article1-2metadata.csv | Article1-2metadata.csv opens as a spreadsheet. |
Point to header row | Observe that the first row is the header row. |
Point to columns of header row | Each column of the header row is a Dublin Core element.
It corresponds to each field in the Item Submission form. |
Narration only | It is mandatory to use header names strictly as provided in this metadata spreadsheet. |
Point to column filename | The first column is filename, which is used to write the name of the file to be uploaded. |
Narration only | The next columns sequentially represent each field in the Item Submission form. |
Point to dc.contributor.author | For example, the first field in the Item Submission form is Authors.
The corresponding column for Author in the spreadsheet is dc.contributor.author. |
Point to multiple columns of
dc.contributor.author |
Author is a multi-value field in the Item Submission form.
So, it is represented by multiple columns, each with the header dc.contributor.author. |
Narration only | Columns can be added or removed for the multi-value fields of the Item Submission form. |
Narration only | Similarly, we have columns for other fields in the Item Submission form. |
Highlight dc.title,dc.date.issued,dc.publisher,etc | Single-value fields in the Item Submission form are represented using a single column. |
Highlight dc.identifier.ismn, dc.identifier.issn, dc.identifier.isbn | Multi-value fields in the Item Submission form are represented using multiple columns. |
Point to row of Article1.pdf | Each row in the spreadsheet represents a separate Item. |
Point to row of Article1.pdf and Article2.pdf | For each Item to be uploaded, write the filename and metadata as shown. |
Point to Article1-2metadata.csv | The metadata spreadsheet should be saved in CSV format only. |
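To make the layout of the spreadsheet concrete, here is a small sketch that writes a CSV of the same shape and counts its Items. The header names are taken from the columns mentioned above, but the selection of columns and all the values are hypothetical. For a real import, use the header row exactly as provided in Article1-2metadata.csv.

```shell
# Sketch of a metadata spreadsheet in CSV form (all values are made up).
csv_file="$(mktemp)"
cat > "$csv_file" <<'EOF'
filename,dc.contributor.author,dc.contributor.author,dc.title,dc.date.issued,dc.publisher
Article1.pdf,"Author, First","Author, Second",Sample title one,2019,Sample Publisher
Article2.pdf,"Author, Third",,Sample title two,2020,Sample Publisher
EOF

# The first row is the header row
head -n 1 "$csv_file"

# Every row below the header describes one Item
item_count=$(( $(wc -l < "$csv_file") - 1 ))
echo "Items described: $item_count"
```

Note that the multi-value field dc.contributor.author appears as two columns, and the second one is simply left empty for an Item with a single author.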
Close the Calc | Close the Calc window now. |
Switch back to terminal | Switch back to the terminal. |
[Terminal]: Type
cp Article1-2metadata.csv $HOME/ItemUpload |
Now, copy the metadata spreadsheet file to the ItemUpload directory.
To do so, type the command as shown here. |
[Terminal] : Type
cd $HOME/ItemUpload |
Now let us check the contents of the ItemUpload directory.
Using the cd command, change the current working directory to the ItemUpload directory. |
[Terminal] : Type ls | Then type ls. |
Highlight Article1.pdf, Article2.pdf, Article1-2metadata.csv | The ItemUpload directory has the files to be uploaded and their metadata in a CSV file,
i.e. Article1.pdf, Article2.pdf and Article1-2metadata.csv. |
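Instead of comparing the listing by eye, the check can be sketched as a small script. This is an optional helper, not part of SAFBuilder. check_csv_files is a hypothetical name, and the sketch assumes that filename is the first column and that filenames contain no commas.

```shell
# Check that every file named in the first CSV column exists next to the CSV.
# Assumes the filename column comes first and filenames contain no commas.
check_csv_files() {
    csv_path="$1"
    csv_dir=$(dirname "$csv_path")
    result=0
    # Skip the header row, then take the first field of each remaining row
    while IFS=, read -r name _rest; do
        [ -n "$name" ] || continue
        if [ -f "$csv_dir/$name" ]; then
            echo "ok: $name"
        else
            echo "missing: $name"
            result=1
        fi
    done <<EOF
$(tail -n +2 "$csv_path")
EOF
    return "$result"
}

# Run the check only if the spreadsheet from this tutorial is present.
metadata_csv="$HOME/ItemUpload/Article1-2metadata.csv"
if [ -f "$metadata_csv" ]; then
    check_csv_files "$metadata_csv"
fi
```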
Narration only | Now, let us proceed to create a zip file in the SAF format. |
[Terminal]: Type
cd $HOME/SAFBuilder |
Type this command to change the present working directory to SAFBuilder. |
[Terminal]: Type
./safbuilder.sh -c $HOME/ItemUpload/Article1-2metadata.csv -z |
Type the next command as shown, to prepare a Simple Archive file.
The Simple Archive file creation is successful. |
Narration only | The Simple Archive file in zip format will be created in the directory of the metadata spreadsheet.
In our case it is ItemUpload. |
[Terminal] : Type
cd $HOME/ItemUpload |
Type this command to change the current directory to the ItemUpload directory. |
[Terminal] : Type ls | Type the command ls to get a list of files in the ItemUpload directory. |
Highlight SimpleArchiveFormat.zip | SimpleArchiveFormat.zip is seen here. |
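Before uploading, you may want to confirm that the archive has the expected structure. The following is a minimal sketch that checks an extracted copy of the archive. validate_saf is a hypothetical helper, and you would pass it whichever directory holds the Item subdirectories after extracting the zip file. It also assumes that anything after a tab on a line of the contents file is an optional setting and not part of the filename.

```shell
# Validate an extracted Simple Archive Format directory.
# Every Item subdirectory must hold dublin_core.xml and a contents file,
# and each file listed in contents must exist.
validate_saf() {
    saf_dir="$1"
    problems=0
    tab=$(printf '\t')
    for item in "$saf_dir"/*/; do
        [ -d "$item" ] || continue
        for required in dublin_core.xml contents; do
            if [ ! -f "$item$required" ]; then
                echo "missing $required in $item"
                problems=$((problems + 1))
            fi
        done
        if [ -f "${item}contents" ]; then
            while IFS= read -r line || [ -n "$line" ]; do
                # Keep only the filename; drop anything after a tab
                name=${line%%"$tab"*}
                [ -n "$name" ] || continue
                if [ ! -f "$item$name" ]; then
                    echo "listed but absent: $item$name"
                    problems=$((problems + 1))
                fi
            done < "${item}contents"
        fi
    done
    echo "problems found: $problems"
    [ "$problems" -eq 0 ]
}
```

The function returns success only when no problem is found, so it can be used in a condition before proceeding to the upload.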
Narration only | Now, let us proceed to upload the zip file for batch import in DSpace. |
Web browser >> Address bar >> localhost:8080 | Open the DSpace interface. |
Log in to DSpace with admin role
Email: dspace.u1@gmail.com Password: u1pass |
Log in to DSpace with your administrator authority.
I will log in with my administrator authority. |
Click on the logged in tab | Click on the Logged in tab at the top right corner. |
Select Administer | Select Administer from the drop-down. |
Click on Content tab | Click on the Content tab in the Navigation bar. |
Select Batch import | From the drop-down, select Batch import. |
Point to Batch import | The Batch import page opens. |
Point to Select the type of the input data | In the Select the type of the input data field, Simple Archive Format (zip file via upload) is selected by default. |
click on Browse button | To upload the SAF zip file, click on the Browse button in the Select data file to upload field. |
Point to File Upload | The File Upload dialog box opens up. |
Select SimpleArchiveFormat.zip | Browse and select the file SimpleArchiveFormat.zip |
Click Open button | Then, click on the Open button. |
Point to SimpleArchiveFormat.zip | On success, the name of the file is displayed next to the Browse button. |
Click on Select Collection drop down | Click on the Select the owning collection of the items drop-down. |
Select the Articles Collection | From the list, select the Collection into which we want to upload the Items.
I will select Articles. |
Point to Select other collections that the items will belong to | The Select other collections that the items will belong to field appears above the Upload button.
For this demonstration, I’m not selecting any other Collection. |
Click Upload button | Click on the Upload button at the bottom of the page. |
Point to notification | A message is displayed, “The job was taken over, an email will be sent as soon as it is finished”. |
Only Narration | We can see the progress of the batch import in the My DSpace page. |
Click on My DSpace link | Click on the My DSpace link. |
Point to Batch imports | In the Batch Imports section we can see the timestamp of the job submission for the batch import. |
Point to success | Also we can see the success status for this batch import job. |
click on show more link | To view more details, click on the Show more link next to the timestamp of the job submission. |
Point to Items to be imported and Items imported | Items to be imported and Items imported are displayed, along with the number of items. |
click on the link show items | To view the imported Items, click on the link Show items next to the label Items imported. |
Narration Only | The mapfile contains a mapping of the Items uploaded in the batch to their handle numbers. |
Point to the Download mapfile button | Download the mapfile by clicking on the Download mapfile button below the label Items imported. |
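Once downloaded, the mapfile can be read with a short script. This sketch assumes that each line of the mapfile holds two space-separated fields, the Item directory name followed by its handle. list_handles is a hypothetical helper, and the path to the downloaded mapfile is supplied by you.

```shell
# Print the Item directory and handle from each line of a mapfile.
# Assumes each line has the form: <item directory> <handle>
list_handles() {
    while read -r item_name handle || [ -n "$item_name" ]; do
        [ -n "$item_name" ] || continue
        echo "$item_name -> $handle"
    done < "$1"
}
```

For example, calling list_handles with the path of the downloaded mapfile prints one line per imported Item.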
Narration Only | We can also delete the Items that were uploaded using batch import. |
Point to Delete uploaded items & remove imports button | To do so, use the button Delete uploaded items & remove import, next to the Download mapfile button. |
Narration Only | For this demonstration, I will not be deleting the Items imported as a batch. |
Narration only | Now, let us cross-verify the batch import of the Items in the Collection. |
click the Browse tab | To do so, click the Browse tab in the Navigation bar. |
Select Communities and Collections | Then click on Communities and Collections from the drop-down. |
Select the Articles Collection | Select the Articles Collection. |
Point to Collection Home Page | The Collection Home Page appears. |
Scroll down | Scroll down to locate the Items we imported in the Collection. |
Point to Items | We can see that the Items submitted using batch import are successfully uploaded here. |
Point to Items | Sometimes, Items uploaded in a batch may take some time to appear in the Collection. |
Select the first Item | Select the first Item submitted using batch import. |
Point to metadata and file | We can see the metadata of the Item and its file. |
Narration only | This means we have successfully uploaded Items using SAFBuilder and Batch import. |
Narration only | On success of the Batch Import job, an email notification is also sent to the Administrator.
Let us proceed to cross-verify the email notification. |
Log into administrator’s email account
Email: dspace.u1@gmail.com Password: d$pace2019* |
Log in to your administrator's email account.
This is my administrator's email account. |
Point to mail
DSpace - Batch import successfully completed |
Here is the email with the subject DSpace - Batch import successfully completed |
Narration Only | If the email is not seen in the Inbox, then it could be in the SPAM folder.
Otherwise please recheck your internet connection. |
Open a mail | Let us open the email. |
Highlight mapfile path | The email contains a mapfile path. |
Narration Only | So now we have verified the email notification of successful completion of batch import. |
Switch back to DSpace | Switch back to DSpace. |
Log out from DSpace | Let us log out from the DSpace interface. |
Only Narration | This brings us to the end of this tutorial.
Let us summarize. |
Slide: Summary | In this tutorial we learnt to
|
Slide: Assignment | As an assignment,
|
Slide : About Spoken Tutorial project | The video at the following link summarises the Spoken Tutorial project.
Please download and watch it. |
Slide : Spoken Tutorial workshops | The Spoken Tutorial Project team conducts workshops and gives certificates.
For more details, please write to us. |
Slide: Forums | Please post your timed queries in this Forum. |
Slide: Acknowledgement -I | Spoken Tutorial project is funded by MHRD, Government of India. |
Slide: Acknowledgement -II | DSpace spoken tutorial series is funded by the National Virtual Library of India, Ministry of Culture, Government of India. |
Narration only | The script and video for this tutorial were contributed by Pankaj Patil from IIT Bombay.
And this is Nancy Varkey signing off. Thanks for joining. |