How to Read a CSV File in Scala
This article was published as a part of the Data Science Blogathon.
Introduction
Scala is hard to learn, true, but it's worth the hard work. Its syntax is lighter and more expressive than Java's, and Scala code is usually much more concise. An engineer who can write short, expressive code that is also type-safe and performs well is considered valuable.
In this project, which I made as a college project, we'll see how to write to a .csv file using Scala, which we will then use to create a basic fruit detection Machine Learning model.
Data Set
The data set that we will use can be found here.
The data set contains four fruits: Apple, Mandarin, Orange, and Lemon. We will classify them solely on the basis of the given height, width, mass, and color score.
Although our dataset is already cleaned, if you wish to use a different dataset, make sure to clean and preprocess the data using Python or any other way you want, to get the most out of your data while training the model.
Writing to the CSV file
For writing the CSV file, we'll use BufferedWriter and FileWriter from java.io, which Scala can use directly, and CSVWriter from a CSV library (the calls used here match opencsv).
We need to import all of the above classes before moving on to choosing a path and giving column headings to our file.
We take a few rows of our data as input for the training dataset and use them to write our CSV file.
val out = new BufferedWriter(new FileWriter("D:/Academic/Assignments/Scala/Fruits.csv")) // opens the file at the given path
val writer = new CSVWriter(out) // creates a CSVWriter object for our file
val FruitSchema = Array("fruit_label", "fruit_name", "fruit_subtype", "mass", "width", "height", "color_score") // the schema/headings of our CSV file
Then we create arrays of our dataset, according to our schema.
To write this data into our CSV file, we need to add this code snippet:
var listOfRecords = List() // this creates a list which holds our data
writer.writeAll(listOfRecords) // this adds our data to the CSV file
out.close() // closes the file
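Putting the steps above together, here is a minimal self-contained sketch. To stay runnable without extra dependencies it swaps CSVWriter for a small hand-written helper (`toCsvLine`, a hypothetical name), writes to a hypothetical relative path `Fruits.csv`, and uses made-up placeholder rows rather than rows from the real dataset.

```scala
import java.io.{BufferedWriter, FileWriter}

object WriteFruitsCsv {
  // Quotes a field only when it contains a comma, a quote, or a line break;
  // embedded quotes are doubled, as is usual for CSV.
  def escape(field: String): String =
    if (field.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + field.replace("\"", "\"\"") + "\""
    else field

  // Hypothetical stand-in for CSVWriter: joins the escaped fields with commas.
  def toCsvLine(fields: Array[String]): String =
    fields.map(escape).mkString(",")

  // Writes every row to the given path, one CSV line per row.
  def writeCsv(path: String, rows: Seq[Array[String]]): Unit = {
    val out = new BufferedWriter(new FileWriter(path))
    try {
      rows.foreach { row =>
        out.write(toCsvLine(row))
        out.newLine()
      }
    } finally out.close() // close the file even if a write fails
  }

  def main(args: Array[String]): Unit = {
    val fruitSchema = Array("fruit_label", "fruit_name", "fruit_subtype",
      "mass", "width", "height", "color_score")
    // Placeholder rows for illustration only, not taken from the real dataset.
    val fruit1 = Array("1", "apple", "example_subtype", "180", "8.0", "7.0", "0.60")
    val fruit2 = Array("4", "lemon", "example_subtype", "120", "6.0", "8.5", "0.72")
    writeCsv("Fruits.csv", Seq(fruitSchema, fruit1, fruit2))
  }
}
```

Wrapping the writes in try/finally means the file is closed even when a write throws, which the three-line snippet does not guarantee.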
Whew, we got that right!
Creating a file using random data
We've created our CSV file using Scala. There is one more way to do this, though, where we generate our data randomly, using ranges which we can then convert to lists.
Firstly, we import all the required libraries.
Then we'll create our lists and ranges, which will contain the data we need in our CSV file.
val widthList = Range.BigDecimal(5.8, 9.6, 0.1).toList // Range.BigDecimal(start, end, step size) lets the range hold decimal values, and toList converts the range to a list
val random = new Random() // this is used to generate the data randomly
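To make the range-to-list step concrete, here is a small sketch. Note that `Range.BigDecimal` excludes the end value, so the list stops at 9.5 (`Range.BigDecimal.inclusive` would keep 9.6). The object name, the `pick` helper, and the fixed seed are assumptions added for illustration.

```scala
import scala.util.Random

object RangeToListDemo {
  // Values from 5.8 up to, but not including, 9.6 in steps of 0.1.
  // BigDecimal avoids the rounding drift that repeated Double addition causes.
  val widthList: List[BigDecimal] =
    Range.BigDecimal(BigDecimal("5.8"), BigDecimal("9.6"), BigDecimal("0.1")).toList

  // Picks one element of the list uniformly at random.
  def pick[A](xs: List[A], random: Random): A = xs(random.nextInt(xs.length))

  def main(args: Array[String]): Unit = {
    val random = new Random(42) // fixed seed (an assumption) for repeatable runs
    println(s"${widthList.length} candidate widths, sample: ${pick(widthList, random)}")
  }
}
```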
Now we will put all this data in our CSV.
var listOfRecords = new ListBuffer[Array[String]]() // this buffer holds all our data
listOfRecords += csvFields // this adds our schema/headings
for (i <- 1 until 50) {
  listOfRecords += Array(i.toString, nameList(random.nextInt(nameList.length)), massList(random.nextInt(massList.length)).toString(), widthList(random.nextInt(widthList.length)).toString(), heightList(random.nextInt(heightList.length)).toString(), colorList(random.nextInt(colorList.length)).toString())
} // the loop which adds data to the buffer
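The loop above relies on lists that the article does not show, so the following is only a sketch of how the pieces might fit together. The bounds of `widthList` come from the article; the contents of `csvFields`, `nameList`, `massList`, `heightList`, and `colorList` are assumptions, as are the `decimals`, `pick`, and `generate` helpers. Also note that `1 until 50` produces 49 rows, because `until` excludes the upper bound.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Random

object RandomFruitRows {
  // Assumed headings: one per field that the loop writes.
  val csvFields: Array[String] =
    Array("id", "fruit_name", "mass", "width", "height", "color_score")

  // Only the bounds of widthList come from the article; the rest are assumed.
  val nameList: List[String] = List("apple", "mandarin", "orange", "lemon")
  val massList: List[Int] = Range(80, 360).toList
  val widthList: List[BigDecimal] = decimals("5.8", "9.6", "0.1")
  val heightList: List[BigDecimal] = decimals("4.0", "10.5", "0.1")
  val colorList: List[BigDecimal] = decimals("0.55", "0.93", "0.01")

  // Builds a list of decimals from start up to, but not including, end.
  def decimals(start: String, end: String, step: String): List[BigDecimal] =
    Range.BigDecimal(BigDecimal(start), BigDecimal(end), BigDecimal(step)).toList

  // Picks one element of the list uniformly at random.
  def pick[A](xs: List[A], random: Random): A = xs(random.nextInt(xs.length))

  // Returns the header row followed by `count` rows of independently sampled values.
  def generate(count: Int, random: Random): List[Array[String]] = {
    val listOfRecords = new ListBuffer[Array[String]]()
    listOfRecords += csvFields
    for (i <- 1 to count) {
      listOfRecords += Array(
        i.toString,
        pick(nameList, random),
        pick(massList, random).toString,
        pick(widthList, random).toString,
        pick(heightList, random).toString,
        pick(colorList, random).toString)
    }
    listOfRecords.toList
  }

  def main(args: Array[String]): Unit =
    generate(49, new Random()).take(4).foreach(row => println(row.mkString(",")))
}
```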
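I used the VLOOKUP function in Excel to add the fruit label.
This code generates the data purely at random: each value is drawn independently of the fruit name, so the rows carry no real relationship between a fruit and its measurements. We need to be very careful before using it.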
Creating the Machine Learning model
To build our model, we'll use Python in a Jupyter notebook.
I added a few more rows of data to my first CSV file to get more accurate results.
Let's get started, by importing the required libraries.
It's always better to keep all the CSV files and Python files in the same folder, so that it's easy for us to code and organize, and for Python to find the file. Now we will read in our CSV file.
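The article reads the file in Python. For readers who would rather stay in Scala, a rough standard-library equivalent might look like the sketch below. It assumes a simple file with a header row and no quoted fields or embedded commas; `parse`, `readCsv`, and the `Fruits.csv` path are hypothetical names.

```scala
import scala.io.Source

object ReadFruitsCsv {
  // Splits CSV text lines into a header and data rows, skipping blank lines.
  // Assumes at least a header line and no quoted fields or embedded commas.
  def parse(lines: Seq[String]): (Array[String], Seq[Array[String]]) = {
    val cells = lines.filter(_.trim.nonEmpty).map(_.split(",", -1).map(_.trim))
    (cells.head, cells.tail)
  }

  // Reads the whole file, then closes it before returning.
  def readCsv(path: String): (Array[String], Seq[Array[String]]) = {
    val source = Source.fromFile(path)
    try parse(source.getLines().toList) finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical path: a file in the folder the program is run from.
    val (header, rows) = readCsv("Fruits.csv")
    println(header.mkString(" | "))
    rows.take(5).foreach(row => println(row.mkString(" | ")))
  }
}
```

For files with quoted fields, a dedicated CSV library is a safer choice than splitting on commas.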
We can also visualize the data using the seaborn library of Python, to understand the data better. I, for one, have skipped it for now.
Let's split our data into training and test data.
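The article performs this step in Python, and the exact call is not shown. As an illustration of what a split does, here is a plain Scala sketch of a shuffled split; the 75/25 ratio, the seed, and the `trainTestSplit` name are assumptions.

```scala
import scala.util.Random

object SplitDemo {
  // Shuffles the rows with a seeded generator, then puts the first
  // `trainFraction` share in the training set and the rest in the test set.
  def trainTestSplit[A](rows: Seq[A], trainFraction: Double, seed: Long): (Seq[A], Seq[A]) = {
    require(trainFraction > 0.0 && trainFraction < 1.0, "trainFraction must be between 0 and 1")
    val shuffled = new Random(seed).shuffle(rows)
    val cut = math.round(rows.length * trainFraction).toInt
    shuffled.splitAt(cut)
  }

  def main(args: Array[String]): Unit = {
    val (train, test) = trainTestSplit(1 to 20, 0.75, seed = 0L)
    println(s"train=${train.length}, test=${test.length}")
  }
}
```

Shuffling before cutting matters when the file is ordered by fruit, since an unshuffled split could leave a whole class out of the training set.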
After splitting the data, let's check which model we can use. I first tried a Decision Tree, as we have comparatively little data.
But we can clearly see that this model is overfitting, so we reject it. Now let's check K Nearest Neighbours.
We can see that the accuracy for both the training and test sets is pretty good, so we can use this model, as it is neither overfitting nor underfitting.
Let's fit our data to the KNN model and check for the best number of neighbors.
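The article does this in Python, and that code is not shown here. To illustrate what the search involves, the following plain Scala sketch (assuming Scala 2.13 or later) implements k-nearest neighbours by majority vote over Euclidean distance and tries several values of k. Every name and the toy points are hypothetical, and this is not the library implementation the article relied on.

```scala
object KnnDemo {
  // Use the total ordering for Double explicitly (available from Scala 2.13).
  import scala.math.Ordering.Double.TotalOrdering

  final case class Sample(features: Vector[Double], label: String)

  // Euclidean distance between two feature vectors of equal length.
  def distance(a: Vector[Double], b: Vector[Double]): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // Predicts by majority vote among the k training samples closest to `query`.
  // Vote ties are broken arbitrarily in this sketch.
  def predict(train: Seq[Sample], query: Vector[Double], k: Int): String =
    train.sortBy(s => distance(s.features, query))
      .take(k)
      .groupBy(_.label)
      .maxBy { case (_, members) => members.size }
      ._1

  // Fraction of test samples whose predicted label matches the true label.
  def accuracy(train: Seq[Sample], test: Seq[Sample], k: Int): Double =
    test.count(s => predict(train, s.features, k) == s.label).toDouble / test.size

  // Returns a candidate k with the highest accuracy on the test samples.
  def bestK(train: Seq[Sample], test: Seq[Sample], candidates: Seq[Int]): Int =
    candidates.maxBy(k => accuracy(train, test, k))

  // Toy, well-separated points for illustration only (not the fruit data).
  val train: Seq[Sample] = Seq(
    Sample(Vector(1.0, 1.0), "A"), Sample(Vector(1.0, 2.0), "A"), Sample(Vector(2.0, 1.0), "A"),
    Sample(Vector(8.0, 8.0), "B"), Sample(Vector(8.0, 9.0), "B"), Sample(Vector(9.0, 8.0), "B"))
  val test: Seq[Sample] = Seq(Sample(Vector(1.5, 1.5), "A"), Sample(Vector(8.5, 8.5), "B"))

  def main(args: Array[String]): Unit = {
    val k = bestK(train, test, Seq(1, 3, 5))
    println(s"k=$k, accuracy=${accuracy(train, test, k)}")
  }
}
```

With real data the features should be scaled first, since mass is numerically much larger than the color score and would otherwise dominate the distance.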
After finding the best value, let's see the prediction score of our model.
It's not the best, but considering we took a small dataset for this project, it is quite good. Finally, we'll plot the decision boundaries of our model.
And, we're done!
Conclusion
We have now learned how to create CSV files using Scala, along with the basics of Machine Learning!
Source: https://www.analyticsvidhya.com/blog/2020/11/writing-a-csv-file-with-scala-and-using-it-to-create-a-machine-learning-model/