Remove duplicated values in one column and return the newest value in another column
I created the following data set to reproduce my problem. The Module (file name) column contains duplicates.
owaspSample <- data.frame(Module=c("AccessDetails.java","AccessDiverse.java","BgField.java","BgStatus.java","CmdDate.java","CmdGameDate.java","CommentDate.java","CostDate.java","EntranceDetails.java","GameDate.java","LdPopDate.java","LeaseCostDate.java","PastApprovalDate.java","ProvisioningDate.java","ReservationDate.java","RefDate.java","ServiceDate.java","StatusDate.java","ProfileDate.java","UpdateCmdDate.java","ViewDate.java","AccessDetails.java","AccessDiverse.java","AuthenticationDate.java","CmdDate.java","CmdSummaryDate.java","CmdViewDate.java","ChangeOrderDate.java","CommentDate.java","CostDate.java","GameDate.java","LdPopDate.java","LeaseCostDate.java","PastApprovalDate.java","ReservationDate.java","RefDate.java","UnderwaterCmdDate.java","WaveDate.java","XmlFormatter.java"),
Category = c("SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","XML External Entity Injection"),
scanDate=c("2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24"),
VulnCount = c("13","15"," 1"," 3","15"," 2","11","30"," 2"," 2"," 2"," 2"," 4"," 2"," 3"," 9"," 1"," 1"," 1"," 8"," 6","25","28"," 3","30"," 1"," 6"," 5","20","23"," 3"," 3"," 4","10"," 3","17"," 1"," 3"," 2"),
Owasp = c("A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A01-Injection"))
I run the following to remove the duplicates, and it seems to work. However, I want to keep the duplicate with the latest date instead. The date must be dynamic, not hard-coded.
owaspSample <- owaspSample[!duplicated(owaspSample$Module),]
For example, given this input:
Module                Category       Date        VulnCount  Owasp
CostDate.java         SQL Injection  2016-10-23  30         A00-SQL Injection
EntranceDetails.java  SQL Injection  2016-10-23  2          A00-SQL Injection
GameDate.java         SQL Injection  2016-10-23  2          A00-SQL Injection
CostDate.java         SQL Injection  2016-10-24  23         A00-SQL Injection
GameDate.java         SQL Injection  2016-10-24  3          A00-SQL Injection
The expected result is:
Module                Category       Date        VulnCount  Owasp
EntranceDetails.java  SQL Injection  2016-10-23  2          A00-SQL Injection
CostDate.java         SQL Injection  2016-10-24  23         A00-SQL Injection
GameDate.java         SQL Injection  2016-10-24  3          A00-SQL Injection
Any ideas on how to do this?
See the `fromLast` argument of `duplicated`. – nicola
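To illustrate the suggestion above, here is a minimal sketch on a cut-down copy of the example table. The data frame name `df` and the numeric `VulnCount` are assumptions made for the illustration. It relies on the rows already being in ascending date order, as they are in the question's data.

```r
# Five rows taken from the example table in the question.
df <- data.frame(
  Module    = c("CostDate.java", "EntranceDetails.java", "GameDate.java",
                "CostDate.java", "GameDate.java"),
  scanDate  = c("2016-10-23", "2016-10-23", "2016-10-23",
                "2016-10-24", "2016-10-24"),
  VulnCount = c(30, 2, 2, 23, 3),
  stringsAsFactors = FALSE
)

# By default duplicated() flags every occurrence after the first.
# With fromLast = TRUE it scans from the end, so the LAST occurrence
# of each Module is the one that is not flagged and therefore kept.
latest <- df[!duplicated(df$Module, fromLast = TRUE), ]
print(latest)
```

Modules that appear only once are never flagged by `duplicated`, so they stay in the result.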
nicola, thanks. That works; at least it returns the modules based on the latest date. However, it seemed to remove files that are not duplicated. With the data set I showed here, the test works fine. I figured out what I was doing wrong –
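If the rows are not guaranteed to arrive in date order, one possible variant is to sort by module and date first and then keep the last row per module. This is only a sketch: `df`, `sorted` and `latest` are illustrative names, and it assumes `scanDate` is always in `yyyy-mm-dd` form so that `as.Date` can parse it.

```r
# Same example rows, deliberately shuffled so the newest scan is not last.
df <- data.frame(
  Module    = c("GameDate.java", "CostDate.java", "EntranceDetails.java",
                "GameDate.java", "CostDate.java"),
  scanDate  = c("2016-10-24", "2016-10-23", "2016-10-23",
                "2016-10-23", "2016-10-24"),
  VulnCount = c(3, 30, 2, 2, 23),
  stringsAsFactors = FALSE
)

# Sort by Module, then by date ascending, so the newest row of each
# module ends up last within its group.
sorted <- df[order(df$Module, as.Date(df$scanDate)), ]

# Keep the last row per Module; no date is hard-coded anywhere.
latest <- sorted[!duplicated(sorted$Module, fromLast = TRUE), ]
print(latest)
```

The result comes back ordered by module name rather than in the original row order.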