Combines two data frames by updating rows in x with values from y based on a common key,
and inserting new rows from y that are not present in x. The function first harmonizes the
column structures of both data frames by adding missing columns and coercing types as necessary.
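The harmonization step can be pictured with a small base R sketch. It is an illustration only: add_missing_columns and harmonize are hypothetical names, not the package's internal AddColumns helper, and the sketch assumes atomic columns.

```r
# Hypothetical stand-in for the column-harmonization step (not the package's code).
# Adds every column found in `template` but absent from `df`, filled with a typed NA.
add_missing_columns <- function(df, template) {
  for (col in setdiff(names(template), names(df))) {
    # Indexing an atomic vector with NA_integer_ yields an NA of the same type,
    # e.g. NA_character_ for a character column. Assumes atomic columns.
    df[[col]] <- rep(template[[col]][NA_integer_], nrow(df))
  }
  df
}

harmonize <- function(x, y) {
  x <- add_missing_columns(x, y)
  y <- add_missing_columns(y, x)
  # Give y the same column order as x.
  list(x = x, y = y[names(x)])
}

x <- data.frame(key = 1:2, a = c(1, 2))
y <- data.frame(key = 3L, b = "z", stringsAsFactors = FALSE)
h <- harmonize(x, y)
names(h$x)  # "key" "a" "b"
names(h$y)  # "key" "a" "b"
```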
Arguments
- x
A data frame to be updated.
- y
A data frame containing new values to update x. Must include the column specified by key.
- key
A character string specifying the unique key column used for matching rows. Defaults to "key".
- check.missing
Logical; if TRUE, performs a cell-by-cell update only when the new value is not missing. Missing values are defined as NA for atomic types or an empty list for list columns. If FALSE, a standard upsert is performed using dplyr::rows_upsert. Defaults to FALSE.
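To make the definition of "missing" concrete, here is a minimal sketch of a predicate that follows the description above. The name is_missing_cell is hypothetical, and treating "an empty list" as any zero-length element is an assumption about how the function reads list columns.

```r
# Hypothetical helper, not part of the documented API.
# Returns a logical vector: TRUE where a cell counts as missing.
is_missing_cell <- function(col) {
  if (is.list(col)) {
    # List column: assume an element is missing when it has length zero.
    lengths(col) == 0L
  } else {
    # Atomic column: an element is missing when it is NA.
    is.na(col)
  }
}

is_missing_cell(c(1, NA, 3))              # FALSE TRUE FALSE
is_missing_cell(list(1:2, list(), "a"))   # FALSE TRUE FALSE
```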
Details
The function works in several steps:
- It computes the union of all column names from x and y and adds any missing columns to both data frames using the internal helper function AddColumns. Missing columns are filled with an appropriate NA value based on their type.
- Both x and y are reordered to have the same column order.
- For each common column (excluding the key), if x's column is entirely NA or if the data types differ, coercion is performed to ensure compatibility between x and y.
- When check.missing is TRUE, the function iterates over each common key and updates each cell in x only if the corresponding cell in y is not missing. Otherwise, it uses dplyr::rows_upsert to perform a standard upsert.
- New rows present in y but not in x are appended.
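The cell-by-cell path can be sketched in base R as follows. This is not the package's implementation: update_non_missing is a hypothetical name, the sketch assumes x and y already share the same atomic columns with compatible types, and the input data is made up so that the effect of skipping missing cells is visible.

```r
# Hypothetical sketch of the check.missing = TRUE behaviour (not the package's code).
# Assumes x and y already have identical, atomic, type-compatible columns.
update_non_missing <- function(x, y, key = "key") {
  value_cols <- setdiff(names(x), key)

  # Position in x of each key in y; NA where the key is new.
  pos <- match(y[[key]], x[[key]])
  in_x <- !is.na(pos)

  for (col in value_cols) {
    new_vals <- y[[col]][in_x]
    keep <- !is.na(new_vals)  # only non-missing cells overwrite x
    x[[col]][pos[in_x][keep]] <- new_vals[keep]
  }

  # Append rows of y whose key does not occur in x.
  rbind(x, y[!in_x, names(x), drop = FALSE])
}

# Illustrative data (not from the examples section): y carries some NA cells.
df1 <- data.frame(key = 1:3, a = c(NA, 2, NA), b = c("x", NA, "z"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(key = c(2, 3, 4), a = c(5, NA, 7), b = c("y", "w", NA),
                  stringsAsFactors = FALSE)
result <- update_non_missing(df1, df2)
```

In this sketch, the NA in df2$a for key 3 leaves the existing cell in df1 untouched, whereas a plain upsert would overwrite the whole row with the values from df2.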
Examples
if (FALSE) { # \dontrun{
  # Example data frames:
  df1 <- data.frame(
    key = 1:3,
    a = c(NA, 2, NA),
    b = c("x", NA, "z"),
    stringsAsFactors = FALSE
  )
  df2 <- data.frame(
    key = c(2, 3, 4),
    a = c(5, 6, 7),
    b = c("y", "w", "v"),
    stringsAsFactors = FALSE
  )
  # Standard upsert (check.missing = FALSE):
  result <- UpdateInsert(df1, df2, key = "key", check.missing = FALSE)
  # Cell-by-cell update (check.missing = TRUE):
  result <- UpdateInsert(df1, df2, key = "key", check.missing = TRUE)
} # }