# Need help with this code please

I just started using MATLAB and I need help with some code that I am trying to run. I have to take the iris.data file, which is in the attachment, and read the data into a matrix. The last field on each line is a string that gives the name of a flower; depending on the name, it has to be changed to 0, 1, or 2 before it goes into the matrix. Here is the code that I have right now, but I am getting an error for line 16 that says "Index exceeds matrix dimensions."

If you can help me get this to work, that would be great. Thanks a lot.

[code]
fid = fopen('iris.data', 'rt');
if (fid == -1)
error('The file cannot be opened!');
end

i = 1;
while feof(fid) == 0
%grab line from file
line = fgetl(fid);

% find all commas
clcq = find(line == ',');

% extract numbers
d = [];
d(1) = str2num(line(1 : q(1) - 1)); % separately do first element

% now handle all middle elements
for j = 2 :length(q)
d(j) = str2num(line(q(j - 1) + 1 : q(j) - 1));
end

x = line(q(length(q)) + 1 : length(line));

strrep(x,'Iris-setosa','0')
strrep(x,'Iris-versicolor','1')
strrep(x,'Iris-virginica','2')

% now handle the last element
d(length(d) + 1) = str2num(x);

% put vector into matrix
D(i, :) = d;

% increment matrix size
i = i + 1;

end

fclose(fid);

[/code]

• I can't run this code... what is "q"?

Anyway I think that the error refers to the line

d(1) = str2num(line(1 : q(1) - 1));

because you are trying to assign a vector to a "scalar" element of d.

Moreover, you should avoid using "line" as a variable name, because it is also a MATLAB function and shadowing it could cause problems.
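
As a rough sketch of the same per-line parsing with the variable renamed, assuming every line of iris.data looks like the sample string below (four comma-separated numbers followed by a species name). The names txt, parts, nums, and species are illustrative and do not come from the original code; strsplit is only available in newer MATLAB releases.

[code]
% Sample line in the assumed iris.data format (made-up example value)
txt = '5.1,3.5,1.4,0.2,Iris-setosa';

% Split on commas; parts is a 1-by-5 cell array of character vectors
parts = strsplit(txt, ',');

% Convert every field except the last one to a number
nums = str2double(parts(1:end-1));   % 1-by-4 numeric row vector

% The last field is the species name, kept as text for now
species = parts{end};
[/code]

Because txt is used instead of "line", the built-in line function stays usable.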

• Here is the updated code, which should work better. I don't understand what I have to change to get rid of this error.

The error I am getting is...
"""
??? In an assignment A(I) = B, the number of elements in B and
I must be the same.

Error in ==> get_iris2 at 31
d(length(d) + 1) = str2num(x);
"""

[code]
fid = fopen('iris.data', 'rt');
if (fid == -1)
error('The file cannot be opened!');
end

i = 1;
while feof(fid) == 0
%grab line from file
line = fgetl(fid);

% find all commas
q = find(line == ',');

% extract numbers
d = [];
d(1) = str2num(line(1 : q(1) - 1)); % separately do first element

% now handle all middle elements
for j = 2 :length(q)
d(j) = str2num(line(q(j - 1) + 1 : q(j) - 1));
end

% now handle the last element
x = line(q(length(q)) + 1 : length(line));

strrep(x,'Iris-setosa','0')
strrep(x,'Iris-versicolor','1')
strrep(x,'Iris-virginica','2')

d(length(d) + 1) = str2num(x);

% put vector into matrix
D(i, :) = d;

% increment matrix size
i = i + 1;

end

fclose(fid);
[/code]
• At this line, in the first iteration, x='Iris-setosa'.
I can't understand why you try to convert this string to a number.
The function str2num converts a number stored in a string into a number:
str2num('5') returns 5. For a string without numbers it returns an empty matrix []. For this reason the empty matrix cannot be assigned to a vector element.
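
A small sketch of this behaviour, assuming no variable or function named Iris or setosa exists in your session. str2double is shown only for comparison and is not used in the code above.

[code]
a = str2num('5');            % numeric text converts to the number 5
b = str2num('Iris-setosa');  % this text cannot be evaluated, so b is []

% Assigning the empty result to a single element is what fails:
%   d(length(d) + 1) = b;

% str2double returns NaN instead of [] for text it cannot convert
c = str2double('Iris-setosa');
[/code]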

Maybe you are using these lines in the wrong way:
strrep(x,'Iris-setosa','0')
strrep(x,'Iris-versicolor','1')
strrep(x,'Iris-virginica','2')
What is the purpose of these lines?
Do you want to assign a different number string to x: '0' if x='Iris-setosa', '1' if x='Iris-versicolor', etc.?
• These lines are supposed to change a string such as "Iris-setosa" to the string "0", and likewise for the others with their respective numbers.

strrep(x,'Iris-setosa','0')
strrep(x,'Iris-versicolor','1')
strrep(x,'Iris-virginica','2')

So if you open the data set as text, every Iris-setosa, Iris-versicolor, and Iris-virginica should be changed to the respective number string listed above.

That is why I use the str2num function: I want to convert the "0", "1", and "2" strings to actual numbers to put in the matrix.

Yes, x should change from 'iris-setosa' to '0' and so on and so forth for the others.

• The problem is here.

These lines don't change the value of x, because strrep returns the modified string and the result is never assigned to anything:
strrep(x,'Iris-setosa','0')
strrep(x,'Iris-versicolor','1')
strrep(x,'Iris-virginica','2')

Because x can take only these three values, you can do it this way:

switch x
case 'Iris-setosa'
d(length(d) + 1)=0;
case 'Iris-versicolor'
d(length(d) + 1)=1;
case 'Iris-virginica'
d(length(d) + 1)=2;
end

If I understand correctly, this should solve your problem.
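
For comparison, here is a sketch of how the original strrep approach could also be made to work, assuming the only change needed is to store the result back into x. The starting value of x and the name label are made up for illustration.

[code]
x = 'Iris-versicolor';   % example value (assumption)

% strrep returns the new string, so the result must be stored
x = strrep(x, 'Iris-setosa', '0');
x = strrep(x, 'Iris-versicolor', '1');
x = strrep(x, 'Iris-virginica', '2');

label = str2double(x);   % the string '1' becomes the number 1
[/code]

The switch version avoids the string-to-number conversion altogether, which is probably simpler here.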

• Yes, that did work! Thank you very much for all of the help!
I appreciate it very much!

• I'm happy to have been helpful.
• How would I go about finding the Euclidean distance between all of the points in the data set? Once I have those distances, how would I find the two closest points in the set?

Any suggestions will help. Thanks a lot!

There is an example of the euclidean distance equation in the code.
I don't think it works correctly though. Also, what would the vectors be? 1 to 150?

[code]
fid = fopen('iris.data', 'rt');
if (fid == -1)
error('The file cannot be opened!');
end

i = 1;
while feof(fid) == 0
%grab line from file
line = fgetl(fid);

% find all commas
q = find(line == ',');

% extract numbers
d = [];
d(1) = str2num(line(1 : q(1) - 1)); % separately do first element

% now handle all middle elements
for j = 2 :length(q)
d(j) = str2num(line(q(j - 1) + 1 : q(j) - 1));
end

% now handle the last element
x = line(q(length(q)) + 1 : length(line));

%strrep(x,'Iris-setosa','0')
%strrep(x,'Iris-versicolor','1')
%strrep(x,'Iris-virginica','2')

switch x
case 'Iris-setosa'
d(length(d) + 1) = 0
case 'Iris-versicolor'
d(length(d) + 1) = 1
case 'Iris-virginica'
d(length(d) + 1) = 2
end

%d(length(d) + 1) = str2num(x);

% put vector into matrix
D(i, :) = d;

% increment matrix size
i = i + 1;

end
fclose(fid);

%%
% calculate Euclidean distance between vectors 1 and 10
% this was an example of an equation I found...
d = 0;
for i = 1 : size(D, 2) - 1
d = d + (D(1, i) - D(10, i)) ^ 2;
end
d = sqrt(d)
[/code]
• First of all, please explain what you mean by "distance". Below are the first 4 rows of your matrix D.
Which distance do you want? Your code computes only the distance between the values of the first and the 10th rows.
Is this what you really want?

Give a numerical example using your matrix so that I can understand what you mean.

5.1 3.5 1.4 0.2 0
4.9 3 1.4 0.2 0
4.7 3.2 1.3 0.2 0
4.6 3.1 1.5 0.2 0

However, your code keeps only the last value of d.
Remember, when you use a for loop and you are interested in all the values, do it this way:

[code]
d = zeros(1, size(D, 2) - 1); % one entry per feature column, so d(i) exists before it is read
for i = 1 : size(D, 2) - 1
d(i) = d(i) + (D(1, i) - D(10, i)) ^ 2; % indexing with (i) keeps every value
end
[/code]

• How about if I instead make a function that contains the euclidean_distance code, so that it goes over all of the components of x and y (x and y being vectors)?

So if I take your code and modify it a little, it might look like the code section below?

I'm just not sure what to do with this part.
What would I put in place of the "?" and are the x and y in this equation below in the correct places?

The answer for the output should be this if you run it with the x and y given in the code: 7.3485 2.4495 2.2361

d = 0;
for i = 1 : size(x, 2) - 1
d(i) = d(i) + (x(1, i) - y(?, i)) ^ 2;

[code]

function d = euclidean_distance (x, y)

% This wouldn't be in the actual code
% Only here for testing purposes to test the equation below
x = [1 2 3; 4 5 6; 7 8 9]
y = [ 8 3 4; 5 4 6; 5 6 7]

d = 0;
for i = 1 : size(x, 2) - 1
d(i) = d(i) + (x(1, i) - y(10, i)) ^ 2;

end
d = sqrt(d)

return
[/code]

After this is executed, I will have another file that finds the two nearest points. I believe I need two for loops, so that the code goes over all of the data points and compares each one to all of the other data points. I would also call the euclidean_distance function from above inside these loops somehow, so that the nearest points are found from the output of the euclidean_distance function.
Is this clear? Sorry if I'm not good at explaining.

This code might start off like the code below, but it has to call the Euclidean function from above somehow.
This code of course only finds the minimum of one vector, so another for loop will have to be added, which I believe would also call the Euclidean function.

[code]

D = get_iris()

n = size(a); % variable "a" being a single vector
min = 0;

for i = 1 : 1 : n
if a(i) < m
m = a(i);
end
end
[/code]

Here is the working code for "get_iris"
The iris.data file should be in the attachment for the original post

[code]
function D = get_iris ()

fid = fopen('iris.data', 'rt');
if (fid == -1)
error('The file cannot be opened!');
end

i = 1;
while feof(fid) == 0
%grab line from file
line = fgetl(fid);

% find all commas
q = find(line == ',');

% extract numbers
d = [];
d(1) = str2num(line(1 : q(1) - 1)); % separately do first element

% now handle all middle elements
for j = 2 :length(q)
d(j) = str2num(line(q(j - 1) + 1 : q(j) - 1));
end

% now handle the last element
x = line(q(length(q)) + 1 : length(line));

%strrep(x,'Iris-setosa','0')
%strrep(x,'Iris-versicolor','1')
%strrep(x,'Iris-virginica','2')

switch x
case 'Iris-setosa'
d(length(d) + 1) = 0
case 'Iris-versicolor'
d(length(d) + 1) = 1
case 'Iris-virginica'
d(length(d) + 1) = 2
end

%d(length(d) + 1) = str2num(x);

% put vector into matrix
D(i, :) = d;

% increment matrix size
i = i + 1;

end
fclose(fid);

return
[/code]
• Sorry, but I can't understand what you mean.
I can try with an example.
x=[1 1]
y=[1 2]
the euclidean distance is d=sqrt((x(1,1)-y(1,1))^2+(x(1,2)-y(1,2))^2);
in 3 dimensions:
x1=[1 1 1]
y1=[1 1 2]
the euclidean distance is d=sqrt((x1(1,1)-y1(1,1))^2+(x1(1,2)-y1(1,2))^2+(x1(1,3)-y1(1,3))^2);
So, if x2 and y2 each contain more than one vector:
x2 = [1 2 3; 4 5 6; 7 8 9]
y2 = [ 8 3 4; 5 4 6; 5 6 7]
What I understand is that you want to calculate the distance between
[1 2 3] and [8 3 4]
[4 5 6] and [5 4 6]
[7 8 9] and [5 6 7]
so
[code]
for i=1:size(x2,1)
d(i)=sqrt((x2(i,1)-y2(i,1))^2+(x2(i,2)-y2(i,2))^2+(x2(i,3)-y2(i,3))^2);
end
[/code]
Now d contains the 3 distances. (Is this what you wanted?)
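
The same distances can be sketched without writing out each coordinate, assuming x2 and y2 have the same size. Note that the values 7.3485 2.4495 2.2361 quoted earlier in the thread correspond to comparing matching columns, not matching rows, so both variants are shown. The names d_rows and d_cols are mine.

[code]
x2 = [1 2 3; 4 5 6; 7 8 9];
y2 = [8 3 4; 5 4 6; 5 6 7];

% Distance between matching rows: sum the squared differences along each row
d_rows = sqrt(sum((x2 - y2) .^ 2, 2));   % 3-by-1 column vector

% Distance between matching columns: sum along each column instead
d_cols = sqrt(sum((x2 - y2) .^ 2, 1));   % 1-by-3 row vector: 7.3485 2.4495 2.2361
[/code]

Which of the two is right depends on whether each point is stored as a row or as a column.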

the minimum distance is simply the minimum value of d, so

[code]
min_dist=min(d);
[/code]
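
For the question about the two closest points in the whole data set, one possible sketch is a double loop over all pairs of rows. It assumes D has the layout shown earlier (four measurements followed by the class label) and uses only the four rows printed above as sample data. The names X, best, bi, and bj are made up for this example.

[code]
% Sample rows in the same layout as D (last column is the class label)
D = [5.1 3.5 1.4 0.2 0;
     4.9 3.0 1.4 0.2 0;
     4.7 3.2 1.3 0.2 0;
     4.6 3.1 1.5 0.2 0];

X = D(:, 1:end-1);        % drop the label column before measuring distance
n = size(X, 1);

best = Inf;               % smallest distance found so far
bi = 0; bj = 0;           % row indices of the closest pair

for i = 1 : n - 1
    for j = i + 1 : n     % j > i, so each pair is visited only once
        dij = sqrt(sum((X(i, :) - X(j, :)) .^ 2));
        if dij < best
            best = dij;
            bi = i;
            bj = j;
        end
    end
end

fprintf('Closest rows: %d and %d, distance %.4f\n', bi, bj, best);
[/code]

For these four sample rows the closest pair is rows 3 and 4. On the full file, replace the sample matrix with D = get_iris(); if the data contains identical rows, the smallest distance will simply be 0.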

• Hi friend,
The code you gave still has a problem. It won't run properly. Can you suggest code for this?