Dictionary of Headers in R



























Is there a way that I can keep a separate list of headers that basically acts like a dictionary that lists the descriptive header and then an easier to use short name for each header that I could call back and forth without needing to maintain the correct order of the columns? I'm not great with this but here is an example of what I was thinking:



Original Data Set



|---------------------|------------------|------------------|
| Descriptive A | Descriptive B | Descriptive C |
|---------------------|------------------|------------------|
| 12 | 34 | 25 |
|---------------------|------------------|------------------|


Dictionary of Headers



|---------------------|------------------|
| long_name | short_name |
|---------------------|------------------|
| Descriptive A | A |
|---------------------|------------------|
| Descriptive B | B |
|---------------------|------------------|
| Descriptive C | C |
|---------------------|------------------|


Then I could have a piece of code that calls on the short_name column of the dictionary to replace the long_name title of the headers with the short_name and then I would not have to rely on the position of headers.



I'm not sure if that is possible but I have a table with 180 columns (that's growing) and they all have descriptive names that don't translate well into R, so I thought this might be a solution that I could continue to add to as the data set grows.

































  • I do not believe there is functionality in R to automatically allow aliases of column names ... in place, that is. You can always rename the columns pre-calc and then rename them back later. The default functionality of the $ operator does allow for partial matches, but I believe they always match (unambiguously) from the left, not from the right as your example portrays. You might try to rewrite $.data.frame so that either one would work, but you risk lots of corner-cases and unintended consequences when messing with that.

    – r2evans
    Nov 27 '18 at 15:49
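
A small sketch of the partial-matching behaviour this comment refers to. The toy frame and its column names are made up for illustration and are not the question's data. As far as base R data frames go, `$` resolves an unambiguous prefix, while an ambiguous prefix or a suffix yields NULL:

```r
# Toy frame with descriptive, non-syntactic column names (illustrative only).
df1 <- data.frame(
  "Descriptive A" = 12,
  "Descriptive B" = 34,
  "Other column"  = 25,
  check.names = FALSE
)

df1$Oth                      # unique prefix: partially matches "Other column"
df1$Desc                     # ambiguous prefix (two candidates): NULL
df1$column                   # suffix rather than prefix: NULL
df1[["Oth"]]                 # [[ ]] matches exactly by default: NULL
df1[["Oth", exact = FALSE]]  # partial matching only on request
```

With 180 descriptive headers that probably share prefixes, partial matching is unlikely to be a dependable substitute for real short names.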




























asked Nov 27 '18 at 15:39









thejuanald



























4 Answers
































Yes, you just need the dictionary (or codebook) as a separate data frame (can be read in from, say, a .csv file). Let's say you have a data frame like this:



df <- data.frame(matrix(rnorm(1000), ncol = 100))
names(df) <- paste0("a very long unfortunate name to be replaced_", 1:ncol(df))


You can create the codebook like this:



codebook <- data.frame(long_name = names(df),
                       short_name = paste0("X_", 1:ncol(df)),
                       stringsAsFactors = FALSE)
head(codebook)

                                      long_name short_name
1 a very long unfortunate name to be replaced_1        X_1
2 a very long unfortunate name to be replaced_2        X_2
3 a very long unfortunate name to be replaced_3        X_3
4 a very long unfortunate name to be replaced_4        X_4
5 a very long unfortunate name to be replaced_5        X_5
6 a very long unfortunate name to be replaced_6        X_6


Let's then rename the columns of df using the short names (at this point the codebook rows are still in the same order as the columns):



names(df) <- codebook$short_name


For fun, let's randomise the rows of codebook to show you can still use it:



codebook <- codebook[sample(nrow(codebook)), ]


Finally you can use match() to retrieve the original long names:



codebook$long_name[match(names(df), codebook$short_name)]

[1] a very long unfortunate name to be replaced_1 a very long unfortunate name to be replaced_2
[3] a very long unfortunate name to be replaced_3 a very long unfortunate name to be replaced_4
[5] a very long unfortunate name to be replaced_5 a very long unfortunate name to be replaced_6
[7] a very long unfortunate name to be replaced_7 a very long unfortunate name to be replaced_8
[9] a very long unfortunate name to be replaced_9 a very long unfortunate name to be replaced_10
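
The same match() idea can be wrapped so that renaming in either direction never depends on row or column order. This is only a sketch: `to_short()` and `to_long()` are hypothetical helper names, the small codebook is made up to mirror the question, and the sketch assumes the codebook has the columns `long_name` and `short_name`. Columns that are not in the codebook yet keep their current name, which may be useful while the data set is still growing.

```r
# Hypothetical helpers (not part of the answer above): rename via a codebook
# with columns long_name and short_name, regardless of row or column order.
to_short <- function(data, codebook) {
  idx <- match(names(data), codebook$long_name)
  # Columns missing from the codebook keep their current name.
  names(data) <- ifelse(is.na(idx), names(data), codebook$short_name[idx])
  data
}

to_long <- function(data, codebook) {
  idx <- match(names(data), codebook$short_name)
  names(data) <- ifelse(is.na(idx), names(data), codebook$long_name[idx])
  data
}

# Codebook rows deliberately in a different order than the data columns.
codebook <- data.frame(
  long_name  = c("Descriptive C", "Descriptive A", "Descriptive B"),
  short_name = c("C", "A", "B"),
  stringsAsFactors = FALSE
)
dat <- data.frame("Descriptive A" = 12, "Descriptive B" = 34,
                  "Descriptive C" = 25, "Not in codebook yet" = 1,
                  check.names = FALSE)

short <- to_short(dat, codebook)
names(short)                        # short names where the codebook has them
back <- to_long(short, codebook)
identical(names(back), names(dat))  # round trip restores the long names
```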





























  • Thank you that worked perfectly!

    – thejuanald
    Nov 27 '18 at 16:20

































Like I commented, I don't think there's a way to do aliasing in place, but for calculation you can do something like:



df1 <- data.frame(
  "Descriptive A" = 12,
  "Descriptive B" = 34,
  "Descriptive C" = 25,
  check.names = FALSE
)


The "aliasing" object can be a frame, but since all you're doing is assigning a name to a name, it is efficiently handled by a named character vector:



df1_aliases <- c(
  "B" = "Descriptive B",
  "A" = "Descriptive A",
  "C" = "Descriptive C"
)


Your aliasing steps would then be an explicit translation of the names before and after the work:



names(df1) <- names(df1_aliases)[ match(names(df1), df1_aliases) ]
df1
#    A  B  C
# 1 12 34 25

### do stuff here ###

names(df1) <- df1_aliases[ match(names(df1), names(df1_aliases)) ]
df1
#   Descriptive A Descriptive B Descriptive C
# 1            12            34            25


It might be feasible to override $.data.frame and $<-.data.frame for basic dollar-sign operations, but you'd also need to override [.data.frame, [[.data.frame, and perhaps even with() (depending on your frame-access habits), and those rewritten functions might not be picked up by every other function you use (depending on its function/namespace search path).



Because of the complexities of tracking down everything that touches the frame, I strongly suggest you make it as explicit as possible: have only one set of names each column is known by (whether the original or your aliases), never both simultaneously. This means the translate/untranslate steps are explicit and anything that works on the frame will work unambiguously.
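
One way to keep the translate/untranslate steps explicit but hard to forget is to wrap them around the work itself. The following is a minimal sketch, not part of the approach above: `with_aliases()` is a hypothetical helper name, it assumes `fn` returns a data frame, and it assumes every column has an alias.

```r
# Hypothetical helper: run `fn` on `data` with short names, then restore the
# long names. `aliases` is a named character vector of the form short = "long".
with_aliases <- function(data, aliases, fn) {
  idx <- match(names(data), aliases)
  stopifnot(!anyNA(idx))            # this sketch requires an alias per column
  names(data) <- names(aliases)[idx]
  result <- fn(data)                # assumed to return a data frame
  # Map surviving short names back; columns created by fn keep their names.
  back <- match(names(result), names(aliases))
  names(result) <- ifelse(is.na(back), names(result), unname(aliases[back]))
  result
}

df1 <- data.frame("Descriptive A" = 12, "Descriptive B" = 34,
                  "Descriptive C" = 25, check.names = FALSE)
df1_aliases <- c(B = "Descriptive B", A = "Descriptive A", C = "Descriptive C")

out <- with_aliases(df1, df1_aliases, function(d) {
  d$total <- d$A + d$B + d$C        # short names are usable inside fn
  d
})
names(out)
```

At any moment the frame still carries only one set of names, so the advice above is respected; the wrapper only guarantees that the names are translated back afterwards.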



















































    You could give the column names their own names (the short aliases), and then subset the names vector by alias before subsetting the data.frame.



    For example, using the iris data:



    short_names <- names(iris)
    names(short_names) <- c("sl","sw","pl","pw","sp")
    attributes(iris)$names <- short_names

    head(iris[names(iris)[c("sl","sp")]])
      Sepal.Length Species
    1          5.1  setosa
    2          4.9  setosa
    3          4.7  setosa
    4          4.6  setosa
    5          5.0  setosa
    6          5.4  setosa
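
A variant of the same idea, sketched under the assumption that you would rather not modify the attributes of the data frame: keep the named vector as a separate lookup object and translate aliases before subsetting. `short_to_long` is a hypothetical name; the aliases are the ones used above.

```r
# Lookup vector kept outside the data frame: names are the short aliases,
# values are the real column names.
short_to_long <- c(sl = "Sepal.Length", sw = "Sepal.Width",
                   pl = "Petal.Length", pw = "Petal.Width", sp = "Species")

# Translate the aliases first, then subset with the real names.
cols <- unname(short_to_long[c("sl", "sp")])
head(iris[cols])
```

The built-in iris object stays untouched here, so other code that relies on its ordinary names keeps working.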





























    • This is really elegant. I thought of using a named vector and match as illustrated by another answer, but this is elegant hacking into the structure of an object. I'd give 10 upvotes if I could. It's motivating me to investigate how the bounty system works.

      – 42-
      Nov 27 '18 at 16:30













    • I like this answer a lot as well. Thank you!

      – thejuanald
      Nov 28 '18 at 16:48

































    Using dict and DF as defined reproducibly in the Note at the end, run the for loop shown below; we can then use A, B and C without quotes as column names.



    for (i in seq_len(nrow(dict))) assign(dict$short_name[i], dict$long_name[i])

    # test - use DF[B] in place of DF["Descriptive B"]
    DF[B]
    ##   Descriptive B
    ## 1            34


    As the test above shows, this is straightforward with conventional subscripting. If you want to use nonstandard evaluation, such as in dplyr, then you will need to use rlang in the usual way:



    library(dplyr)

    DF %>% mutate(D = !!sym(B))
    ##   Descriptive A Descriptive B Descriptive C  D
    ## 1            12            34            25 34


    Note



    We assume this input:



    Lines1 <- "
    long_name | short_name
    Descriptive A | A
    Descriptive B | B
    Descriptive C | C"
    dict <- read.table(text = Lines1, header = TRUE, sep = "|", as.is = TRUE,
    strip.white = TRUE)

    Lines2 <- "
    Descriptive A | Descriptive B | Descriptive C
    12 | 34 | 25"
    DF <- read.table(text = Lines2, header = TRUE, sep = "|", as.is = TRUE,
    strip.white = TRUE, check.names = FALSE)
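
With 180 or more columns, assign() creates one global variable per column, which can collide with existing objects. A possible alternative, sketched here with a hypothetical list named `cols`, keeps all aliases in a single object. The inputs are rebuilt inline with data.frame() so the block runs on its own; they are meant to match dict and DF from the Note.

```r
dict <- data.frame(
  long_name  = c("Descriptive A", "Descriptive B", "Descriptive C"),
  short_name = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
DF <- data.frame("Descriptive A" = 12, "Descriptive B" = 34,
                 "Descriptive C" = 25, check.names = FALSE)

# One list holding all aliases instead of one global variable per column.
# `cols` is a hypothetical name chosen for this sketch.
cols <- as.list(setNames(dict$long_name, dict$short_name))

DF[cols$B]                    # same as DF["Descriptive B"]
DF[[cols$A]] + DF[[cols$C]]   # arithmetic on columns looked up by alias
with(cols, DF[c(A, C)])       # unquoted aliases, scoped to this one call
```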

































































      answered Nov 27 '18 at 16:01









      Milan Valášek




































          answered Nov 27 '18 at 16:01









          r2evans






































              You could give the column names themselves names (the short aliases), and then subset the names by alias before subsetting the data.frame.



              For example, using the iris data:



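              # values are the real column names, names are the short aliases
              short_names <- names(iris)
              names(short_names) <- c("sl","sw","pl","pw","sp")
              # store the named vector as the frame's names attribute
              attributes(iris)$names <- short_names

              # look up the real names by alias, then subset the frame
              head(iris[names(iris)[c("sl","sp")]])
                Sepal.Length Species
              1          5.1  setosa
              2          4.9  setosa
              3          4.7  setosa
              4          4.6  setosa
              5          5.0  setosa
              6          5.4  setosa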
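The same lookup also works if the named vector is kept as a separate object instead of being stored in the frame's names attribute. A minimal sketch, assuming a hypothetical lookup vector called `short` and the built-in iris data:

```r
# `short` is an illustrative name: aliases as names, real column names as values.
short <- c(sl = "Sepal.Length", sw = "Sepal.Width",
           pl = "Petal.Length", pw = "Petal.Width", sp = "Species")

# Translate aliases to real column names, then subset as usual.
cols <- unname(short[c("sl", "sp")])
sub <- iris[cols]
names(sub)

# Single columns work the same way with [[ ]].
mean(iris[[short[["pl"]]]])
```

Because the frame's own names are never touched, code that expects the original names keeps working, and the lookup vector can grow alongside the data set.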















              answered Nov 27 '18 at 16:22









              James














              • This is really elegant. I thought of using a named vector and match as illustrated by another answer, but this is elegant hacking into the structure of an object. I'd give 10 upvotes if I could. It's motivating me to investigate how the bounty system works.

                – 42-
                Nov 27 '18 at 16:30













              • I like this answer a lot as well. Thank you!

                – thejuanald
                Nov 28 '18 at 16:48












































              Using dict and DF, defined reproducibly in the Note at the end, run the for loop shown below.
              It creates variables A, B and C that hold the long names, so we can then use A, B and C without quotes to refer to columns.



              # create one variable per short name, holding the matching long name
              for(i in seq_len(nrow(dict))) assign(dict$short_name[i], dict$long_name[i])

              # test - use DF[B] in place of DF["Descriptive B"]
              DF[B]
              ##   Descriptive B
              ## 1            34


              As the test above shows, this is straightforward with conventional subscripting. If you want to use non-standard evaluation, as in dplyr, you will need to use rlang in the usual way:



              library(dplyr)

              DF %>% mutate(D = !!sym(B))
              ##   Descriptive A Descriptive B Descriptive C  D
              ## 1            12            34            25 34


              Note



              We assume this input:



              Lines1 <- "
              long_name | short_name
              Descriptive A | A
              Descriptive B | B
              Descriptive C | C"
              dict <- read.table(text = Lines1, header = TRUE, sep = "|", as.is = TRUE,
                strip.white = TRUE)

              Lines2 <- "
              Descriptive A | Descriptive B | Descriptive C
              12 | 34 | 25"
              DF <- read.table(text = Lines2, header = TRUE, sep = "|", as.is = TRUE,
                strip.white = TRUE, check.names = FALSE)
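The assign() loop creates one variable per column in the workspace and overwrites any existing variable of the same name. If that feels too intrusive for a table with 180 columns, a possible variation is to keep the short names in a list and evaluate expressions against it with with(). This is a sketch under that assumption; `cols` is a hypothetical name, and dict and DF are rebuilt directly so the block runs on its own.

```r
dict <- data.frame(
  long_name  = c("Descriptive A", "Descriptive B", "Descriptive C"),
  short_name = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
DF <- data.frame("Descriptive A" = 12, "Descriptive B" = 34,
                 "Descriptive C" = 25, check.names = FALSE)

# A list whose element names are the short names and whose values are the long names.
cols <- as.list(setNames(dict$long_name, dict$short_name))

# Inside with(), A, B and C resolve to the long names;
# no A, B or C variables are created in the workspace.
res <- with(cols, DF[c(A, C)])
names(res)

total <- with(cols, DF[[A]] + DF[[B]])
total
```

The trade-off is that every expression using the short names has to be wrapped in with(cols, ...), which is more typing but keeps the aliases in one object.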













                  edited Nov 27 '18 at 16:18

























                  answered Nov 27 '18 at 16:04









                  G. Grothendieck
