Assign rank to each row in a group in R
I have the following input dataframe and would like to first group it by `Gene` and then arrange by descending `Expression`. Once I've done that, I'd like to add a `Rank` column that ranks each row per `Gene` according to the `Expression` value, so rows with higher `Expression` per gene get ranked higher.
I've already done the group by and arrange by part (below), but I'm struggling with how to do the ranking.
dat_sorted <- dat %>% select(Gene, Expression, Sample) %>%
group_by(Gene) %>%
arrange(Gene, desc(Expression))
**INPUT (dat)**
Gene Expression Sample
ENSG00000000027 2.79336700 HSB431
ENSG00000000938 0.83478860 HSB414
ENSG00000000003 2.40009100 HSB618
ENSG00000000938 1.75148448 HSB671
ENSG00000000938 1.52182467 HSB670
ENSG00000000938 0.62174432 HSB459
ENSG00000000003 2.81561500 HSB671
**EXPECTED OUTPUT**
Gene Expression Sample Rank
ENSG00000000003 2.81561500 HSB671 1
ENSG00000000003 2.40009100 HSB618 2
ENSG00000000027 2.79336700 HSB431 1
ENSG00000000938 1.75148448 HSB671 1
ENSG00000000938 1.52182467 HSB670 2
ENSG00000000938 0.83478860 HSB414 3
ENSG00000000938 0.62174432 HSB459 4
UPDATE
When trying:
dat %>%
group_by(Gene) %>%
mutate(Rank = dense_rank(Expression)) %>%
arrange(Gene, Expression, Rank)
I get:
Gene Sample Expression Rank
ENSG00000000003 HSB626 3.52200400 31107
ENSG00000000938 HSB152 -1.60663921 1585
ENSG00000000938 HSB425 -0.40209856 3536
ENSG00000000938 HSB627 -1.09598712 2244
ENSG00000000938 HSB645 -0.82846242 2666
ENSG00000000971 HSB154 4.61434903 53421
ENSG00000000971 HSB154 4.61434903 53421
ENSG00000000971 HSB154 4.61434903 53421
ENSG00000000971 HSB195 2.45561878 18041
ENSG00000000971 HSB222 5.54389646 79697
r dataframe ranking
asked Nov 28 '18 at 17:04 by claudiadast (edited Nov 28 '18 at 17:20)
`... %>% mutate(Rank = 1:n())`; this relies on your `arrange` ordering. Or use the `rank` function (which ranks low to high, so we need a negative): `... %>% mutate(Rank = rank(-Expression))`, which just uses the `Expression` values so it doesn't depend on the ordering. Also, `rank` has several options for dealing with ties. – Gregor, Nov 28 '18 at 17:07
@Gregor: Do you mind providing a complete example? I've tried variations of what you suggested but am still not having any luck. – claudiadast, Nov 28 '18 at 17:32
From your edit it looks like you've loaded `plyr` after `dplyr` and ignored the warnings. See this R-FAQ. Run `detach(package:plyr)` or specify `dplyr::mutate` and it should work. – Gregor, Nov 28 '18 at 17:34
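For a complete version of the `rank(-Expression)` suggestion, here is a minimal base R sketch. It assumes the seven sample rows from the question's input; `ave()` stands in for `group_by() %>% mutate()` so that nothing depends on which of `plyr` or `dplyr` is attached, and `ties.method = "min"` is just one illustrative choice among the options `rank` offers.

```r
# Sample rows copied from the question's input table.
dat <- data.frame(
  Gene = c("ENSG00000000027", "ENSG00000000938", "ENSG00000000003",
           "ENSG00000000938", "ENSG00000000938", "ENSG00000000938",
           "ENSG00000000003"),
  Expression = c(2.79336700, 0.83478860, 2.40009100, 1.75148448,
                 1.52182467, 0.62174432, 2.81561500),
  Sample = c("HSB431", "HSB414", "HSB618", "HSB671", "HSB670",
             "HSB459", "HSB671"),
  stringsAsFactors = FALSE
)

# rank() ranks low to high, so negate Expression to give the highest value rank 1.
# ave() applies the function within each Gene and keeps the original row order.
dat$Rank <- ave(-dat$Expression, dat$Gene,
                FUN = function(x) rank(x, ties.method = "min"))

# Sorting is for display only; the ranks do not depend on it.
dat_sorted <- dat[order(dat$Gene, dat$Rank), ]
print(dat_sorted)
```

Because the rank is computed from the values themselves, the result is the same whether or not the data frame was sorted beforehand.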
2 Answers
We can use `dense_rank`. Wrap `Expression` in `desc()` so that the highest value per gene gets rank 1:
dat %>%
  group_by(Gene) %>%
  mutate(Rank = dense_rank(desc(Expression))) %>%
  arrange(Gene, Rank)
answered Nov 28 '18 at 17:07 by akrun
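The choice of ranking function matters when a gene has tied values. As a small sketch, assuming `dplyr` is installed, the vector below reuses the five `ENSG00000000971` values from the question's update (three of them identical) to compare `dense_rank`, `min_rank`, and `row_number`:

```r
library(dplyr)

# Expression values for ENSG00000000971 as shown in the question's update,
# including a three-way tie.
x <- c(4.61434903, 4.61434903, 4.61434903, 2.45561878, 5.54389646)

# desc() flips the order so the largest value gets rank 1.
r_dense <- dense_rank(desc(x))  # ties share a rank, no gaps:    2 2 2 3 1
r_min   <- min_rank(desc(x))    # ties share a rank, gaps after: 2 2 2 5 1
r_row   <- row_number(desc(x))  # ties broken by position:       2 3 4 5 1

print(rbind(r_dense, r_min, r_row))
```

If every row needs a distinct rank within its gene, `row_number` is the one to use; `dense_rank` and `min_rank` both give tied rows the same rank.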
The following worked, calling `dplyr::mutate` explicitly so that it is not masked by `plyr::mutate`:
dat_rank <- dat %>%
  group_by(Gene) %>%
  arrange(Gene, desc(Expression)) %>%
  dplyr::mutate(Rank = 1:n())
answered Nov 28 '18 at 17:42 by claudiadast
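For anyone who wants to run this end to end, here is a self-contained sketch of the same approach on the sample input from the question. It assumes `dplyr` is installed. Qualifying every verb with `dplyr::` is an extra precaution against `plyr` masking, and `row_number()` and `ungroup()` are substitutions and additions of mine rather than part of the original answer.

```r
library(dplyr)

# Sample rows copied from the question's input table.
dat <- data.frame(
  Gene = c("ENSG00000000027", "ENSG00000000938", "ENSG00000000003",
           "ENSG00000000938", "ENSG00000000938", "ENSG00000000938",
           "ENSG00000000003"),
  Expression = c(2.79336700, 0.83478860, 2.40009100, 1.75148448,
                 1.52182467, 0.62174432, 2.81561500),
  Sample = c("HSB431", "HSB414", "HSB618", "HSB671", "HSB670",
             "HSB459", "HSB671"),
  stringsAsFactors = FALSE
)

dat_rank <- dat %>%
  dplyr::group_by(Gene) %>%
  # Sort by gene, then by highest expression first.
  dplyr::arrange(Gene, dplyr::desc(Expression)) %>%
  # Within each gene, number the rows in their sorted order.
  dplyr::mutate(Rank = dplyr::row_number()) %>%
  # Drop the grouping so later steps operate on the whole table.
  dplyr::ungroup()

print(dat_rank)
```

Note that this version depends on the `arrange` step: the rank is simply the row position within each gene after sorting.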