Combines two data frames by updating rows in x with values from y based on a common key, and inserting new rows from y that are not present in x. The function first harmonizes the column structures of both data frames by adding missing columns and coercing types as necessary.
Arguments
- x
A data frame to be updated.
- y
A data frame containing new values to update x. Must include the column specified by key.
- key
A character string specifying the unique key column used for matching rows. Defaults to "key".
- check.missing
Logical; if TRUE, performs a cell-by-cell update only when the new value is not missing. Missing values are defined as NA for atomic types or an empty list for list columns. If FALSE, a standard upsert is performed using dplyr::rows_upsert. Defaults to FALSE.
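The definition of "missing" used by check.missing can be illustrated with a small predicate. This is a minimal sketch only: is_missing_cell is a hypothetical helper written for illustration and is not part of the documented interface, and the package's own check may differ in detail.

```r
# Hypothetical helper (not part of the documented API): flags "missing" cells
# following the definition given for check.missing.
is_missing_cell <- function(col) {
  if (is.list(col)) {
    # List column: a cell counts as missing when its element is empty.
    lengths(col) == 0L
  } else {
    # Atomic column: a cell counts as missing when it is NA.
    is.na(col)
  }
}

atomic_flags <- is_missing_cell(c(1, NA, 3))
list_flags <- is_missing_cell(list(1:2, list(), "a"))
```

Under this sketch, only the second cell is flagged in each of the two calls.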
Details
The function works in several steps:
- It computes the union of all column names from x and y and adds any missing columns to both data frames using the internal helper function AddColumns. Missing columns are filled with an appropriate NA value based on their type.
- Both x and y are reordered to have the same column order.
- For each common column (excluding the key), if x's column is entirely NA or if the data types differ, coercion is performed to ensure compatibility between x and y.
- When check.missing is TRUE, the function iterates over each common key and updates each cell in x only if the corresponding cell in y is not missing. Otherwise, it uses dplyr::rows_upsert to perform a standard upsert.
- New rows present in y but not in x are appended.
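The steps above can be sketched in base R for the check.missing = TRUE path. This is an illustration under stated assumptions, not the package's implementation: sketch_update_insert is a hypothetical name, only atomic columns are handled, the coercion step covers only the "entirely NA" case, and the internal AddColumns helper is approximated inline.

```r
# Hypothetical sketch of the documented steps; assumes atomic columns only.
sketch_update_insert <- function(x, y, key = "key") {
  # Step 1: add columns missing from either side, filled with a typed NA
  # taken from the data frame that does have the column.
  all_cols <- union(names(x), names(y))
  for (col in setdiff(all_cols, names(x))) {
    x[[col]] <- rep(y[[col]][NA_integer_], nrow(x))
  }
  for (col in setdiff(all_cols, names(y))) {
    y[[col]] <- rep(x[[col]][NA_integer_], nrow(y))
  }

  # Step 2: give both data frames the same column order.
  x <- x[all_cols]
  y <- y[all_cols]

  # Step 3 (simplified): if a column of x is entirely NA, retype it to match y.
  for (col in setdiff(all_cols, key)) {
    if (all(is.na(x[[col]]))) {
      x[[col]] <- rep(y[[col]][NA_integer_], nrow(x))
    }
  }

  # Step 4: for each common key, overwrite a cell of x only when the
  # corresponding cell of y is not NA.
  for (k in intersect(x[[key]], y[[key]])) {
    xi <- match(k, x[[key]])
    yi <- match(k, y[[key]])
    for (col in setdiff(all_cols, key)) {
      new_value <- y[[col]][yi]
      if (!is.na(new_value)) x[[col]][xi] <- new_value
    }
  }

  # Step 5: append rows of y whose key does not occur in x.
  new_rows <- y[!(y[[key]] %in% x[[key]]), , drop = FALSE]
  out <- rbind(x, new_rows)
  rownames(out) <- NULL
  out
}

# Made-up data: y has an NA for key 2 and an extra column b.
x_demo <- data.frame(key = c(1, 2), a = c(10, 20))
y_demo <- data.frame(key = c(2, 3), a = c(NA, 30), b = c("p", "q"),
                     stringsAsFactors = FALSE)
out_demo <- sketch_update_insert(x_demo, y_demo)
```

In this sketch the NA in y leaves the existing value of a for key 2 untouched, while b is filled in for key 2 and the row for key 3 is appended.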
Examples
if (FALSE) { # \dontrun{
# Example data frames:
df1 <- data.frame(
  key = 1:3,
  a = c(NA, 2, NA),
  b = c("x", NA, "z"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  key = c(2, 3, 4),
  a = c(5, 6, 7),
  b = c("y", "w", "v"),
  stringsAsFactors = FALSE
)
# Standard upsert (check.missing = FALSE):
result <- UpdateInsert(df1, df2, key = "key", check.missing = FALSE)
# Cell-by-cell update (check.missing = TRUE):
result <- UpdateInsert(df1, df2, key = "key", check.missing = TRUE)
} # }
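For comparison, the standard upsert path is documented as delegating to dplyr::rows_upsert. The sketch below calls that dplyr function directly on made-up data (x_std and y_std are illustrative names, not taken from the package) to show how an NA in y is treated when no missing-value check is applied. It assumes dplyr 1.0.0 or later is installed.

```r
# Made-up data: y carries an NA for an existing key and one new key.
x_std <- data.frame(key = c(1, 2), a = c(10, 20))
y_std <- data.frame(key = c(2, 3), a = c(NA, 30))

# Standard upsert: matching rows take y's values as they are (including NA),
# and rows with unmatched keys are inserted.
out_std <- dplyr::rows_upsert(x_std, y_std, by = "key")
```

Here the value of a for key 2 is replaced by NA, which is the behaviour that check.missing = TRUE is described as avoiding.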