Quantcast
Channel: MATLAB Central Newsreader Recent Posts
Viewing all articles
Browse latest Browse all 19628

Re: how to find unique rows and calculate the average.

$
0
0
On 4/11/2013 6:51 PM, dpb wrote:
...
>>> OK, it just dawned on me--your problem is trying to use RAW; ...
...
> That's what I was saying just realized--if it is an Excel spreadsheet
> that you're trying to read, TEXTSCAN() can't do it--it reads, well, 'text'.
>
> I was just pointing out that when you use XLSREAD that text that is the
> ID column, etc., is in the TXT cell array while the numeric values are
> in NUM. The TEXTSCAN line goes away; fid is the file handle obtained
> from fopen() on the file but xlsread takes the filename and opens it
> internally instead.
>
> I have to admit I've never used XLSREAD so I'm not absolutely positive
> how it structures the cell array(s) it returns--one thing that is clear
> is that you'll have to either start the location where it reads the file
> from the cell below the header line whichever line that is on inside the
> sheet or it will also be mixed up in the TXT array and I've no clue what
> it'll do about the NUM array to match.
>
> Alternatively, if TXT is a cell array of rows, it should be reasonably
> easy to use Bruno's solution on it--I have only recently been blessed w/
> a new-enough release of Matlab to incorporate accumarray and various
> other new-fangled things despite 20 yrs w/ Matlab. Consequently I tend
> to forget about them.
...

Well decided I'd try a new and old together...I took the lines you
posted and the header line and put them in a .xls file (something I
almost also never use, but that aside...) and named it Alan.xls

Then the following using Bruno's solution works just fine on the RAW
cella array if you specify the lines w/o the header on the XLSREAD call
(another reason to avoid Excel--having to know the sizes a priori)...

 >> [~,~,R]=xlsread('alan.xls',1,'A2:S6');
 >> [id, ~, J] = unique(R(:,8));
 >> [id num2cell(accumarray(J, [R{:,9}]', [], @mean))]
ans =
     '3BAB38287CYVA' [11]
     '3BAB38287GAAF' [64]
     '3BAB38287LUPA' [20]
     '3BAB38287PAPU' [35]
     '3BAB38287POLA' [79]
 >>

There's a mishmash if you don't exclude the header and particularly when
separating the text and numeric data--it's confused by the ID columns
that thinks maybe should have been valid hex or somesuch it
seems--they're returned as NaN's. Consequently it's another hassle to
figure out which column it is to operate on there but at least while
it's counting, you can rely on which is which in the RAW format.

Would be simpler it still seems to me to import a text file instead form
that standpoint albeit another step to save the xls file as csv or
something.

Anyway, should solve your problem...

--

Viewing all articles
Browse latest Browse all 19628

Trending Articles