Bulk Loading Data into Cassandra Using SSTableLoader
Why Use SSTableLoader:
When you want to move data from another database into Cassandra, sstableloader is a good option.
It streams prebuilt SSTable files to the cluster, so the data can be transferred very fast.
Steps to load the data into Cassandra:
Step 1: Create the keyspace.
CREATE KEYSPACE sample WITH
REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1 };
Step 2: Create the table based on your requirements.
CREATE TABLE sample.users (
key uuid,
firstname ascii,
lastname ascii,
password ascii,
age ascii,
email ascii,
PRIMARY KEY (key, firstname));
The statement above creates the users table. The primary key is (key, firstname): key is the partition key and firstname is a clustering column.
Step 3: Create the .csv file based on your table.
Sample Java program to create the CSV file:
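import java.io.FileWriter;
import java.io.IOException;
import java.util.UUID;

public class CreateCsv {
    public static void main(String[] args) {
        generateCsvFile("E:/csv/records.csv");
    }

    // Writes one row per user in the column order of the users table:
    // key (uuid), firstname, lastname, password, age, email.
    // The generated names, passwords and emails are placeholder sample values.
    public static void generateCsvFile(String csvName) {
        try (FileWriter writer = new FileWriter(csvName)) {
            for (int i = 0; i < 1000000; i++) {
                writer.append(UUID.randomUUID().toString()).append(',');
                writer.append("first" + i).append(',');
                writer.append("last" + i).append(',');
                writer.append("pass" + i).append(',');
                writer.append("26").append(',');
                writer.append("user" + i + "@example.com").append('\n');
            }
            System.out.println("Success");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}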
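Before building SSTables it can help to confirm that every CSV row matches the table layout. The sketch below is only an illustration and is not part of the original programs; the class name CsvRowCheck and its method are hypothetical. It assumes six comma-separated fields in the order key, firstname, lastname, password, age, email, with a UUID in the first field and a whole number in the fifth.

```java
import java.util.UUID;

public class CsvRowCheck {
    // Returns true when the line has six fields, a parseable UUID in the
    // first field, and a whole number in the fifth (age) field.
    public static boolean isValidRow(String line) {
        String[] columns = line.split(",");
        if (columns.length != 6) {
            return false;
        }
        try {
            UUID.fromString(columns[0].trim());
            Long.parseLong(columns[4].trim());
            return true;
        } catch (IllegalArgumentException e) {
            // Covers both a malformed UUID and a non-numeric age.
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "123e4567-e89b-12d3-a456-426614174000,John,Smith,secret,26,john@example.com";
        String bad = "0,26";
        System.out.println(isValidRow(good));
        System.out.println(isValidRow(bad));
    }
}
```

Running a check like this over the file first means malformed rows are found before any SSTable files are written.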
After creating the project for sstableloader, these steps are mandatory:
· Add all the Cassandra jars to the project. They are available in the lib and tools folders of the Cassandra tar or zip file provided by DataStax.
· Add the cassandra.yaml file from the conf folder of the same Cassandra tar or zip file.
· Add the .csv file to the project. For example, I put sstable.csv in my project.
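A quick way to confirm the setup is to check that the expected files are present before running anything else. The sketch below is hypothetical (the class ProjectFilesCheck is not part of the original project) and assumes that cassandra.yaml and sstable.csv sit directly in the directory being checked; adjust the location if your project keeps them elsewhere.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ProjectFilesCheck {
    // Returns the names from 'required' that do not exist under 'projectDir'.
    public static List<String> missingFiles(Path projectDir, String... required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!Files.exists(projectDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: both files are in the current working directory.
        List<String> missing = missingFiles(Paths.get("."), "cassandra.yaml", "sstable.csv");
        if (missing.isEmpty()) {
            System.out.println("All required files found");
        } else {
            System.out.println("Missing: " + missing);
        }
    }
}
```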
Step 4: Create the data for sstableloader using a Java program.
package com.cassandra.ramu;

import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
import static org.apache.cassandra.utils.UUIDGen.decompose;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.db.marshal.AsciiType;
import org.apache.cassandra.db.marshal.CompositeType;
import org.apache.cassandra.db.marshal.CompositeType.Builder;
import org.apache.cassandra.db.marshal.UUIDType;
import org.apache.cassandra.dht.Murmur3Partitioner;
import org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter;

public class SStableBuilder {
    static String csvfilename = "sstable.csv";

    public static void main(String[] args) {
        try {
            buildSStables();
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    public static void buildSStables() throws Exception {
        String keyspace = "sample";
        String table = "users";
        File directory = new File(keyspace + "/" + table);
        if (!directory.exists()) {
            directory.mkdirs();
        }
        List<AbstractType<?>> compositeColumnValues = new ArrayList<AbstractType<?>>();
        compositeColumnValues.add(AsciiType.instance);
        compositeColumnValues.add(AsciiType.instance);
        CompositeType compositeColumn = CompositeType.getInstance(compositeColumnValues);
        SSTableSimpleUnsortedWriter bulkWriter = new SSTableSimpleUnsortedWriter(
                directory, new Murmur3Partitioner(), keyspace, table,
                compositeColumn, null, 64);
        // Create a single timestamp for each insert
        long timestamp = System.currentTimeMillis() * 1000;
        BufferedReader reader = new BufferedReader(new FileReader(csvfilename));
        String line;
        int lineNumber = 1;
        CsvEntry entry = new CsvEntry();
        while ((line = reader.readLine()) != null) {
            if (entry.parse(line, lineNumber)) {
                ByteBuffer uuid = ByteBuffer.wrap(decompose(entry.key));
                bulkWriter.newRow(uuid);

                Builder builder = compositeColumn.builder();
                builder.add(bytes(entry.firstname));
                builder.add(bytes("firstname"));
                bulkWriter.addColumn(builder.build(), bytes(entry.firstname), timestamp);

                builder = compositeColumn.builder();
                builder.add(bytes(entry.firstname));
                builder.add(bytes("lastname"));
                bulkWriter.addColumn(builder.build(), bytes(entry.lastname), timestamp);

                builder = compositeColumn.builder();
                builder.add(bytes(entry.firstname));
                builder.add(bytes("password"));
                bulkWriter.addColumn(builder.build(), bytes(entry.password), timestamp);

                builder = compositeColumn.builder();
                builder.add(bytes(entry.firstname));
                builder.add(bytes("age"));
                bulkWriter.addColumn(builder.build(), bytes(String.valueOf(entry.age)), timestamp);

                builder = compositeColumn.builder();
                builder.add(bytes(entry.firstname));
                builder.add(bytes("email"));
                bulkWriter.addColumn(builder.build(), bytes(entry.email), timestamp);
            }
            lineNumber++;
        }
        reader.close();
        System.out.println("Success");
        bulkWriter.close();
        System.exit(0);
    }

    static class CsvEntry {
        UUID key;
        String firstname;
        String lastname;
        String password;
        long age;
        String email;

        boolean parse(String line, int lineNumber) {
            // Naive CSV parsing: plain split on commas, no quoting support
            String[] columns = line.split(",");
            if (columns.length != 6) {
                System.out.println(String.format(
                        "Invalid input '%s' at line %d of %s", line, lineNumber, csvfilename));
                return false;
            }
            try {
                key = UUID.fromString(columns[0].trim());
                firstname = columns[1].trim();
                lastname = columns[2].trim();
                password = columns[3].trim();
                age = Long.parseLong(columns[4].trim());
                email = columns[5].trim();
                return true;
            } catch (IllegalArgumentException e) {
                // Also covers NumberFormatException, which is a subclass;
                // UUID.fromString throws IllegalArgumentException on a bad key
                System.out.println(String.format(
                        "Invalid UUID or number in input '%s' at line %d of %s", line, lineNumber, csvfilename));
                return false;
            }
        }
    }
}
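The builder passes the row key to newRow as the raw bytes of the UUID. To illustrate what such a conversion produces, the sketch below uses only the JDK to pack a UUID into 16 bytes, most significant 64 bits first. This layout is an assumption made for illustration; it is not the source of Cassandra's UUIDGen.decompose, and the class name UuidBytes is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBytes {
    // Packs the UUID into 16 bytes: most significant long first, then least.
    public static ByteBuffer toBytes(UUID uuid) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(uuid.getMostSignificantBits());
        buffer.putLong(uuid.getLeastSignificantBits());
        buffer.flip(); // reset the position so the buffer is ready to be read
        return buffer;
    }

    public static void main(String[] args) {
        UUID key = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        ByteBuffer bytes = toBytes(key);
        System.out.println(bytes.remaining()); // prints 16
    }
}
```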
The SStableBuilder Java program above creates the data to load into Cassandra.
After SStableBuilder runs, the generated SSTable files are in the sample/users folder (keyspace/table) under the directory where the program was run.
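To confirm that the builder produced output, you can list the data files in the output folder. The sketch below is a hypothetical helper (ListSSTables is not part of the original project). It assumes the output folder is sample/users relative to the working directory and that SSTable data components have names ending in "-Data.db".

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListSSTables {
    // Returns the sorted names of files in 'directory' that end in "-Data.db".
    public static List<String> dataFiles(File directory) {
        List<String> names = new ArrayList<>();
        File[] files = directory.listFiles();
        if (files == null) {
            return names; // directory is missing or is not a directory
        }
        for (File file : files) {
            if (file.isFile() && file.getName().endsWith("-Data.db")) {
                names.add(file.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Assumption: SStableBuilder was run from the current directory.
        List<String> names = dataFiles(new File("sample/users"));
        System.out.println("Data files found: " + names.size());
        for (String name : names) {
            System.out.println(name);
        }
    }
}
```

An empty result suggests the builder did not run to completion or wrote to a different working directory.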
Step 5: Run the sstableloader command from the command prompt.
CMD:
sstableloader -d 127.0.0.1 pathofaboveusers
For example, if the workspace above is on the D drive, go to the D drive and give the path up to the users folder.
For -d, use the IP address declared in your cassandra.yaml file. I used the IP declared in mine; if your cassandra.yaml declares 127.0.0.1, use 127.0.0.1.
The path is the folder where the SStableBuilder program created the data.
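If you prefer to start the load from Java rather than typing the command, a sketch along these lines may work. It is an illustration under assumptions, not part of the original post: it assumes the sstableloader script is on the PATH, that the node address is 127.0.0.1, and that the data is in sample/users. The class name RunSSTableLoader is hypothetical. Only the -d option shown in the command above is used.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RunSSTableLoader {
    // Builds the same command line as the CMD example: sstableloader -d <host> <path>
    public static List<String> buildCommand(String host, String sstableDir) {
        return Arrays.asList("sstableloader", "-d", host, sstableDir);
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumptions: sstableloader is on the PATH and sample/users exists.
        List<String> command = buildCommand("127.0.0.1", "sample/users");
        try {
            // inheritIO sends the tool's output to this program's console.
            Process process = new ProcessBuilder(command).inheritIO().start();
            int exitCode = process.waitFor();
            System.out.println("sstableloader exited with code " + exitCode);
        } catch (IOException e) {
            System.out.println("Could not start sstableloader: " + e.getMessage());
        }
    }
}
```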
Comments:
Does the first column of the CSV data being imported, and the first column of the destination table, have to be a uuid value for this example to work?
What about a CSV file and a Cassandra table without a uuid column? Would you make the newRow call using the concatenated string values of the primary key columns?
Also, what is "decompose" here? I.e., in which Java package is it defined?
It works the same way without a UUID, but some changes are required.
In the CsvEntry class, define the key as follows:
String key;
key = columns[0].trim();
Where the code reads each line of the CSV, change it as follows:
String sss = entry.key;
bulkWriter.newRow(ByteBuffer.wrap(sss.getBytes()));
Builder builder = compositeColumn.builder();
builder.add(bytes(entry.key)); // Here you need to give the primary key
builder.add(bytes("ts"));
bulkWriter.addColumn(builder.build(), bytes(String.valueOf(entry.ts)), timestamp);
How fast is this tool? I have a 160GB CSV file to insert. How much time should I expect?
I tried compiling this with Cassandra 3.0.4 and it gets an error:
SStableBuilder.java:18: error: SSTableSimpleUnsortedWriter is not public in org.apache.cassandra.io.sstabl